Abstract. We analyze adaptive mesh-refining algorithms for conforming finite element
discretizations of certain non-linear second-order partial differential equations. We allow 
continuous polynomials of arbitrary, but fixed polynomial order. The adaptivity is driven 
by the residual error estimator. We prove convergence even with optimal algebraic 
convergence rates. In particular, our analysis covers general linear second-order elliptic 
operators. Unlike prior works for linear non-symmetric operators, our analysis avoids 
the interior node property for the refinement, and the differential operator has to satisfy
a Gårding inequality only. If the differential operator is uniformly elliptic, no additional
assumption on the initial mesh is imposed.



1. Introduction 

Let $\Omega$ be a bounded polyhedral Lipschitz domain in $\mathbb{R}^d$, $d \ge 2$. We consider a
homogeneous Dirichlet boundary value problem for a certain non-linear second-order elliptic
partial differential equation (PDE)

    $\mathcal{L}u(x) := -\operatorname{div}(A(x,\nabla u)) + g(x,u,\nabla u) = f(x)$ in $\Omega$,    (1a)

    $u = 0$ on $\Gamma := \partial\Omega$.    (1b)

The differential operator $\mathcal{L} = \mathcal{A} + \mathcal{K}$ is split into a principal part
$\mathcal{A}u = -\operatorname{div}(A(\cdot,\nabla u))$ and a compact perturbation
$\mathcal{K}u = g(\cdot,u,\nabla u)$, see Subsection 6.5 for the precise regularity
assumptions. This framework also includes the case of general linear second-order elliptic
operators

    $\mathcal{L}u := -\operatorname{div}(A\nabla u) + b\cdot\nabla u + cu$.    (2)
We consider a common adaptive mesh-refining algorithm which iterates the following loop

    solve  →  estimate  →  mark  →  refine    (3)

The module solve computes a piecewise polynomial finite element approximation $U_\ell$ of
$u$ with respect to a given mesh $\mathcal{T}_\ell$. For estimate, we use a residual error
estimator, see e.g. [3, 29]. Next, the Dörfler marking criterion [14] is used to single out
elements for refinement. Finally, refine leads to a locally refined and improved mesh
$\mathcal{T}_{\ell+1}$ by means of the newest vertex bisection algorithm (NVB).

So far, available results on convergence and quasi-optimality of adaptive finite element
methods (AFEM) from the literature essentially dealt with the linear, symmetric, and
elliptic case (2) with $b = 0$ and $c \ge 0$, see e.g. [6, 8, 11, 14, 18, 27] and the references
therein. As far as the linear and non-symmetric case $b \ne 0$ is concerned, we are only
aware of the works [12, 19] which, however, considered the special situation
$\operatorname{div} b = 0$ and $c \ge 0$. Moreover, their analysis requires the interior node
property for the refinement at least after a fixed number of steps, which has been introduced
in [21] to guarantee a discrete lower bound for the error. Finally, the proofs of convergence
and quasi-optimality in [12, 19] assume the initial mesh $\mathcal{T}_0$ to be sufficiently fine
although the assumption $\operatorname{div} b = 0$ already ensures ellipticity of the associated
bilinear form $b(\cdot,\cdot)$ in the weak formulation of (1), i.e. the operator $\mathcal{L}$
in (2) is uniformly elliptic. All this is different to the present work, and the advances over
the state of the art, see e.g. [11, 12, 18], are fourfold:

(i) In the linear case (2), our assumptions on the data $A = A(x)$, $b = b(x)$, and
$c = c(x)$ only ensure that the bilinear form $b(\cdot,\cdot)$ of the weak formulation of (1)
is continuous and satisfies a Gårding inequality on $H^1_0(\Omega)$.

(ii) As for the symmetric case [11], we only rely on standard newest vertex bisection,
and the interior node property is avoided.

(iii) If $b(\cdot,\cdot)$ is elliptic, we avoid any assumption on the initial mesh
$\mathcal{T}_0$. If $b(\cdot,\cdot)$ satisfies a Gårding inequality, we require the same
assumption on the initial mesh as [12, 19] to ensure well-posedness of the finite element
formulations.

(iv) To the best of the authors' knowledge and besides [5] for the particular p-Laplace
problem, this work provides the first quasi-optimality result for a class of non-linear
problems.

From a technical point of view, our analytical argument works as follows and is illustrated
for the linear operator $\mathcal{L}$ from (2) with induced bilinear form $b(\cdot,\cdot)$:
First, the estimator reduction

    $\eta_{\ell+1}^2 \le q\,\eta_\ell^2 + C\,|||U_{\ell+1} - U_\ell|||^2$    (4)

together with a Céa-type quasi-optimality already implies convergence $U_\ell \to u$ as
$\ell \to \infty$ (Proposition 4), see also [2] for this estimator reduction principle. Here,
$0 < q < 1$ and $C > 0$ are generic constants, and $|||\cdot|||$ denotes the energy quasi-norm
induced by $b(\cdot,\cdot)$. Second, the novel contribution in our analysis is that this
additional knowledge allows us to prove a quasi-Pythagoras theorem

    $|||U_{\ell+1} - U_\ell|||^2 + |||u - U_{\ell+1}|||^2 \le \frac{1}{1-\varepsilon}\,|||u - U_\ell|||^2$    (5)

for all $\varepsilon > 0$ and $\ell \ge \ell_0(\varepsilon)$ sufficiently large (Proposition 7)
which unlike [12, 19] avoids any additional assumption on the mesh-size of $\mathcal{T}_\ell$.
With estimator reduction (4) and quasi-orthogonality (5) at hand, we next observe R-linear
convergence

    $\eta_{\ell+k}^2 \le C\,q^k\,\eta_\ell^2$ for all $k \in \mathbb{N}$    (6)

of the error estimator (Theorem 8) with further generic constants $C > 0$ and $0 < q < 1$.
Finally, the R-linear convergence (6) suffices to follow the paths of [27, 11] to prove even
quasi-optimal convergence rates in the sense of

    $(u,f) \in \mathbb{A}_s \iff \eta_\ell \le C\,(\#\mathcal{T}_\ell - \#\mathcal{T}_0)^{-s}$ for all $\ell \in \mathbb{N}$,    (7)

i.e. each theoretically possible convergence order $O(N^{-s})$ for the error estimator will
asymptotically be achieved by AFEM. The approximation class involved in (7) is defined in
Section 5. By means of reliability and efficiency of the error estimator $\eta_\ell$ used,
this quasi-optimality result can equivalently be stated in terms of error plus oscillations
as is done in [11, 12, 18, 27]. As has first been observed in [1], our approach and analysis,
however, fully avoids the use of lower bounds for the error, i.e. all constants are
independent of the efficiency estimate.

For the nonlinear problem (1), we observe that estimator reduction (4), R-linear
convergence (6), as well as quasi-optimality (7) do not hinge on linearity of $\mathcal{L}$.
We thus bootstrap the arguments developed for the linear case to prove a quasi-Pythagoras
theorem (5) for nonlinear $\mathcal{L}$ (Proposition 18), and may derive convergence of AFEM
with quasi-optimal algebraic rates.

The remainder of this paper is organized as follows: For the sake of a clear presentation,
we first consider the linear case (2) with elliptic bilinear form $b(\cdot,\cdot)$
corresponding to the weak formulation of (1). This case already includes the main ideas of
how to cope with compact perturbations. In Section 2, we explicitly state the assumptions on
the differential operator $\mathcal{L}$ from (2), recall the continuous and discrete
variational formulation of (1), and give the necessary details on the four modules of (3).
Section 3 then provides the estimator reduction (4), which follows as in [11], and the
quasi-Galerkin orthogonality (5), which relies on the convergence of AFEM and compactness
arguments. The short Section 4 proves R-linear convergence (6) of the error estimator by use
of (4)-(5). We stress that, so far, the analysis hinges neither on the precise
mesh-refinement used nor on the adaptivity parameter chosen. By use of intrinsic properties
of NVB, we then prove quasi-optimal convergence rates (7) in Section 5. A final Section 6 is
concerned with extensions of our analysis. Amongst other topics, we discuss other boundary
conditions than (1b) as well as changes of our analysis if the bilinear form $b(\cdot,\cdot)$
satisfies only a Gårding inequality. Subsection 6.5 bootstraps the arguments of the previous
sections and incorporates the non-linear case (1a) into the analysis.

In all statements, the constants involved and their dependencies are explicitly stated.
In proofs, however, we use the symbol $\lesssim$ to abbreviate $\le$ up to a multiplicative
constant. Moreover, $\simeq$ abbreviates that both estimates $\lesssim$ and $\gtrsim$ hold.

2. Model Problem & Adaptive Algorithm 

This section is devoted to stating the model problem (1) with linear differential
operator (2) in weak form and to collecting all the ingredients needed to formulate the
adaptive algorithm. The presented problem is not the most general case to which the developed
theory can be applied, but it allows for a rather simple presentation and illustrates the
main difficulties of the problem. We refer to Section 6 for possible extensions and
generalizations.

2.1. Variational formulation. For a given right-hand side $f \in L^2(\Omega)$, we consider
the elliptic boundary value problem (1) with linear operator $\mathcal{L}$ from (2). For the
weak formulation and to prove optimal convergence rates, we require some regularity
assumptions on the coefficients. We assume that $A = A(x) \in \mathbb{R}^{d\times d}$ with
$A \in L^\infty(\Omega)$ is a symmetric matrix, $b = b(x) \in \mathbb{R}^d$ with
$b \in L^{d/(d+2)}(\Omega)$ is a vector, and $c = c(x) \in \mathbb{R}$ is a scalar. This allows
us to write down the weak formulation of (1): Find
$u \in H^1_0(\Omega) := \{v \in H^1(\Omega) : v|_\Gamma = 0 \text{ in the sense of traces}\}$ such that

    $b(u,v) := \int_\Omega A\nabla u\cdot\nabla v + b\cdot\nabla u\,v + c\,u\,v\,dx = \int_\Omega f\,v\,dx$ for all $v \in H^1_0(\Omega)$.    (8)

According to Sobolev's embedding theorem, there holds
$H^1_0(\Omega) \subset L^{2d/(d-2)}(\Omega)$. The bilinear form $b(\cdot,\cdot)$ is therefore
well-defined and bounded with

    $|b(u,v)| \le C_{\rm cont}\,\|\nabla u\|_{L^2(\Omega)}\,\|\nabla v\|_{L^2(\Omega)}$ for all $u,v \in H^1_0(\Omega)$,    (9)

where the constant
$C_{\rm cont} := \|A\|_{L^\infty(\Omega)} + \|b\|_{L^{d/(d+2)}(\Omega)} + \|c\|_{L^{d/2}(\Omega)}$
depends only on the coefficients of $\mathcal{L}$. Additionally, we assume that the
coefficients ensure that $b(\cdot,\cdot)$ is elliptic, i.e.

    $b(u,u) \ge C_{\rm ell}\,\|\nabla u\|_{L^2(\Omega)}^2$ for all $u \in H^1_0(\Omega)$    (10)

for some constant $C_{\rm ell} > 0$, see Section 6 if $b(\cdot,\cdot)$ satisfies only a Gårding
inequality.

Now, the Lax-Milgram lemma guarantees unique solvability of (8) for all $f \in L^2(\Omega)$
and proves continuous dependence
$\|\nabla u\|_{L^2(\Omega)} \lesssim \|f\|_{H^{-1}(\Omega)} \lesssim \|f\|_{L^2(\Omega)}$. Here,
$H^{-1}(\Omega) := H^1_0(\Omega)^*$ denotes the dual space of $H^1_0(\Omega)$, and duality is
understood with respect to the extended $L^2$-scalar product, i.e.

    $\|f\|_{H^{-1}(\Omega)} = \sup_{v \in H^1_0(\Omega)\setminus\{0\}} \frac{\langle f\,,\,v\rangle}{\|\nabla v\|_{L^2(\Omega)}}$.

Moreover, the bilinear form $b(\cdot,\cdot)$ defines a quasi-norm
$|||\cdot||| := b(\cdot,\cdot)^{1/2}$, i.e. $|||\cdot|||$ is definite and homogeneous, but
satisfies the triangle inequality only up to some multiplicative constant. Due to ellipticity
and continuity of $b(\cdot,\cdot)$, it holds

    $C_{\rm norm}^{-1}\,\|\nabla v\|_{L^2(\Omega)} \le |||v||| \le C_{\rm norm}\,\|\nabla v\|_{L^2(\Omega)}$ for all $v \in H^1_0(\Omega)$    (11)

for a constant $C_{\rm norm} = \max\{C_{\rm ell}^{-1/2}, C_{\rm cont}^{1/2}\} > 0$.

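2.2. Discrete formulation. For any regular triangulation $\mathcal{T}_\ell$ of $\Omega$ (see
Section 2.5 below) and $p \ge 1$, we consider the piecewise polynomials

    $\mathcal{P}^p(\mathcal{T}_\ell) := \{V_\ell \in L^2(\Omega) : \text{for all } T \in \mathcal{T}_\ell,\ V_\ell|_T \text{ is a polynomial of degree at most } p\}$

as well as the conforming ansatz and test space

    $\mathcal{S}^p_0(\mathcal{T}_\ell) := \mathcal{P}^p(\mathcal{T}_\ell) \cap H^1_0(\Omega)$.

Now, the discrete formulation of (8) reads: Find $U_\ell \in \mathcal{S}^p_0(\mathcal{T}_\ell)$ such that

    $b(U_\ell, V_\ell) = \int_\Omega f\,V_\ell\,dx$ for all $V_\ell \in \mathcal{S}^p_0(\mathcal{T}_\ell)$.    (12)

As in the continuous case (8), existence and uniqueness of $U_\ell$ follows from the
Lax-Milgram lemma. Moreover, there holds the Céa lemma

    $\|\nabla(u - U_\ell)\|_{L^2(\Omega)} \le \frac{C_{\rm cont}}{C_{\rm ell}} \min_{V_\ell \in \mathcal{S}^p_0(\mathcal{T}_\ell)} \|\nabla(u - V_\ell)\|_{L^2(\Omega)}$.    (13)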



2.3. Error estimator. We use the standard weighted-residual error estimator with
the local contributions

    $\eta_\ell(T)^2 := |T|^{2/d}\,\|\mathcal{L}|_T U_\ell - f\|_{L^2(T)}^2 + |T|^{1/d}\,\|[A\nabla U_\ell\cdot n]\|_{L^2(\partial T\cap\Omega)}^2$ for all $T \in \mathcal{T}_\ell$, $\ell \in \mathbb{N}$.

Here, $|T|$ is the d-dimensional volume of $T \in \mathcal{T}_\ell$, and
$[A\nabla U_\ell\cdot n]|_E := (A\nabla U_\ell|_{T_1})\cdot n_{T_1} + (A\nabla U_\ell|_{T_2})\cdot n_{T_2}$
denotes the conormal jump over the facet $E := T_1 \cap T_2$ for all
$T_1, T_2 \in \mathcal{T}_\ell$, where $n_{T_1}, n_{T_2}$ denote the outward pointing unit
normals on the respective element boundaries. Note that due to the regularity assumptions on
the coefficients, there holds $\mathcal{L}|_T U_\ell \in L^2(T)$ for all $T \in \mathcal{T}_\ell$.
The error estimator $\eta_\ell$ is defined as the $\ell_2$-sum of the elementwise contributions

    $\eta_\ell^2 := \sum_{T \in \mathcal{T}_\ell} \eta_\ell(T)^2$.

As shown in e.g. [3, 29], the error estimator is reliable, i.e. for all regular triangulations
$\mathcal{T}_\ell$ and corresponding solutions $U_\ell$ of (12), it holds

    $\|\nabla(u - U_\ell)\|_{L^2(\Omega)} \le C_{\rm rel}\,\eta_\ell$    (14)

for a constant $C_{\rm rel} > 0$. Moreover, $\eta_\ell$ is also efficient, i.e.

    $C_{\rm eff}^{-1}\,\eta_\ell \le \|\nabla(u - U_\ell)\|_{L^2(\Omega)} + {\rm osc}_\ell$    (15)

for a constant $C_{\rm eff} > 0$ and oscillation terms

osc,^ := 5^ irmi - U^f'){C\TU, - /)||i.(^), 
T£Te 

where Ilf"^ : L'^{Q) — ?■ V^^^iTi) denotes the L^-orthogonal projection. The constants 
$C_{\rm rel}, C_{\rm eff} > 0$ depend only on the $\gamma$-shape regularity of $\mathcal{T}_\ell$
(see Section 2.5 below), the polynomial degree $p \ge 1$, and on $\Omega$. We stress that
unlike [11, 12, 18], efficiency (15) is not used throughout our analysis.

2.4. Adaptive algorithm. Now, we are in the position to formulate the adaptive
algorithm (3) in detail.

Algorithm 1. Input: Initial triangulation $\mathcal{T}_0$ and adaptivity parameter $0 < \theta \le 1$.
Loop: For $\ell = 0, 1, 2, \ldots$ do (i)-(iv)

(i) Compute discrete solution $U_\ell$ of (12).

(ii) Compute refinement indicators $\eta_\ell(T)$ for all $T \in \mathcal{T}_\ell$.

(iii) Determine set $\mathcal{M}_\ell \subseteq \mathcal{T}_\ell$ of minimal cardinality such that

    $\theta\,\eta_\ell^2 \le \sum_{T \in \mathcal{M}_\ell} \eta_\ell(T)^2$.    (16)

(iv) Refine (at least) the marked elements $T \in \mathcal{M}_\ell$ to obtain the triangulation $\mathcal{T}_{\ell+1}$.
Output: Approximate solutions $U_\ell$ and error estimators $\eta_\ell$ for all $\ell \in \mathbb{N}$.
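
Step (iii) can be illustrated by a minimal Python sketch. It assumes that the squared local indicators $\eta_\ell(T)^2$ are already available as a plain list; the function name `doerfler_mark` and this data layout are hypothetical and not taken from the paper. Picking elements in order of decreasing indicator is one standard way to obtain a set of minimal cardinality.

```python
from typing import List, Sequence


def doerfler_mark(eta_sq: Sequence[float], theta: float) -> List[int]:
    """Return indices of a minimal-cardinality set M with
    theta * sum(eta_sq) <= sum(eta_sq[i] for i in M), cf. (16)."""
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must lie in (0, 1]")
    total = sum(eta_sq)
    # Sort element indices by decreasing squared indicator (O(n log n)).
    order = sorted(range(len(eta_sq)), key=lambda i: eta_sq[i], reverse=True)
    marked: List[int] = []
    accumulated = 0.0
    for i in order:
        if accumulated >= theta * total:
            break  # marking criterion already satisfied
        marked.append(i)
        accumulated += eta_sq[i]
    return marked
```

For example, `doerfler_mark([4.0, 1.0, 3.0, 2.0], 0.5)` marks the two elements with the largest indicators, since their sum 7 exceeds half of the total 10.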

2.5. Mesh refinement. Given an initial mesh $\mathcal{T}_0$ which is regular in the sense of
Ciarlet, we construct the subsequent meshes $\mathcal{T}_\ell$ by local refinement with the
newest vertex bisection for simplicial meshes in $\mathbb{R}^d$, $d \ge 2$, see e.g.
[29, Chapter 4] resp. [28]. Consequently, the set of meshes which can be obtained reads

    $\mathbb{T} := \{\mathcal{T}_\star : \mathcal{T}_\star \text{ is a refinement of } \mathcal{T}_0\}$.    (17)

The finite subset of meshes with at most $N \in \mathbb{N}$ elements more than the initial mesh
is defined as

    $\mathbb{T}_N := \{\mathcal{T}_\star \in \mathbb{T} : \#\mathcal{T}_\star - \#\mathcal{T}_0 \le N\}$.

The meshes $\mathcal{T}_\star \in \mathbb{T}$ are regular in the sense of Ciarlet and
$\gamma$-shape regular in the sense of

    $\gamma^{-1}\,|T|^{1/d} \le {\rm diam}(T) \le \gamma\,|T|^{1/d}$    (18)

for some $\gamma \ge 1$ which depends only on $\mathcal{T}_0$. A refined element
$T \in \mathcal{T}_\ell$ is split into at least two sons, i.e. we have

    $\#(\mathcal{T}_\ell \setminus \mathcal{T}_\star) \le \#\mathcal{T}_\star - \#\mathcal{T}_\ell$    (19)

for all refinements $\mathcal{T}_\star \in \mathbb{T}$ of $\mathcal{T}_\ell \in \mathbb{T}$. As a key
property for the optimality proof, the meshes generated by Algorithm 1 satisfy the crucial
closure estimate

    $\#\mathcal{T}_\ell - \#\mathcal{T}_0 \le C_{\rm mesh} \sum_{j=0}^{\ell-1} \#\mathcal{M}_j$ for all $\ell \in \mathbb{N}$    (20)

with some constant $C_{\rm mesh} > 0$ which depends only on $\mathcal{T}_0$. For $d \ge 3$,
$\mathcal{T}_0$ has to satisfy a certain condition on the reference edges, cf. [6, 28], while
this assumption can be dropped for $d = 2$, see the recent work [17]. Finally, for two meshes
$\mathcal{T}_\ell, \mathcal{T}_\star \in \mathbb{T}$ there is a coarsest common refinement
$\mathcal{T}_\ell \oplus \mathcal{T}_\star \in \mathbb{T}$ which satisfies

    $\#(\mathcal{T}_\ell \oplus \mathcal{T}_\star) \le \#\mathcal{T}_\ell + \#\mathcal{T}_\star - \#\mathcal{T}_0$,    (21)

see [11, 27]. We stress that newest vertex bisection is a binary refinement rule, and the
coarsest common refinement $\mathcal{T}_\ell \oplus \mathcal{T}_\star$ is just the overlay of
both meshes.
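
The overlay estimate (21) can be checked on a toy example. The following Python sketch is only an illustration under a strong simplification: it uses interval bisection in one space dimension instead of NVB for simplices in $d \ge 2$, and the helper names `bisect` and `overlay` are hypothetical. In this 1D setting, the overlay is obtained by merging the breakpoints of both meshes.

```python
from fractions import Fraction
from typing import FrozenSet, Iterable, Tuple

Interval = Tuple[Fraction, Fraction]
Mesh = FrozenSet[Interval]


def bisect(mesh: Mesh, marked: Iterable[Interval]) -> Mesh:
    """Split every marked interval of the mesh at its midpoint."""
    marked_set = set(marked)
    refined = set()
    for (a, b) in mesh:
        if (a, b) in marked_set:
            mid = (a + b) / 2
            refined.update({(a, mid), (mid, b)})
        else:
            refined.add((a, b))
    return frozenset(refined)


def overlay(mesh1: Mesh, mesh2: Mesh) -> Mesh:
    """Coarsest common refinement of two 1D bisection meshes:
    the intervals between all breakpoints of both meshes."""
    points = sorted({p for interval in mesh1 | mesh2 for p in interval})
    return frozenset(zip(points[:-1], points[1:]))


# Toy initial mesh T0 = {[0, 1]} and two different refinements of it.
zero, half, one = Fraction(0), Fraction(1, 2), Fraction(1)
T0: Mesh = frozenset({(zero, one)})
TA = bisect(bisect(T0, [(zero, one)]), [(zero, half)])   # refine towards 0
TB = bisect(bisect(T0, [(zero, one)]), [(half, one)])    # refine towards 1
T_overlay = overlay(TA, TB)
```

Here both refinements have three elements, the overlay has four, and the bound in (21) gives 3 + 3 - 1 = 5.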

3. Convergence & Quasi-Orthogonality 

The aim of this section is to prove convergence, without relying on symmetry properties
of $\mathcal{L}$, which can be done by use of the concept of estimator reduction [2]. To that
end, we define the subspace $\mathcal{S}^p_0(\mathcal{T}_\infty)$ of $H^1_0(\Omega)$ which is
theoretically affected by Algorithm 1 as

    $\mathcal{S}^p_0(\mathcal{T}_\infty) := \overline{\bigcup_{\ell \in \mathbb{N}} \mathcal{S}^p_0(\mathcal{T}_\ell)}$,    (22)

where the closure is taken with respect to the $H^1$-norm. With convergence $U_\ell \to u$ and
hence $u \in \mathcal{S}^p_0(\mathcal{T}_\infty)$ at hand, we are then able to prove a novel
quasi-Galerkin orthogonality estimate (27), which is sufficient to prove linear
convergence (30) as well as optimal convergence rates (35).

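3.1. Convergence. The following result is proved in [11] for symmetric $\mathcal{L}$ and shows
that the error estimator $\eta_\ell$ is contractive up to a certain perturbation.

Lemma 2. There exist constants $0 < q_{\rm est} < 1$ and $C_{\rm est} > 0$, such that there holds

    $\eta_{\ell+1}^2 \le q_{\rm est}\,\eta_\ell^2 + C_{\rm est}\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}^2$ for all $\ell \in \mathbb{N}$.    (23)

The constants $q_{\rm est}$ and $C_{\rm est}$ depend only on $\theta$, the $\gamma$-shape
regularity of $\mathcal{T}_{\ell+1}$, the polynomial degree $p \in \mathbb{N}$, and on $\Omega$.

Proof. The proof follows verbatim the proof of [11, Corollary 3.4]. Therefore, we give
a rough sketch only. The application of Young's inequality
$2ab \le \delta a^2 + \delta^{-1} b^2$ and standard inverse inequalities prove for $\delta > 0$

    $\eta_{\ell+1}^2 \le (1+\delta) \sum_{T' \in \mathcal{T}_{\ell+1}} \big( |T'|^{2/d}\,\|\mathcal{L}|_{T'} U_\ell - f\|_{L^2(T')}^2 + |T'|^{1/d}\,\|[A\nabla U_\ell\cdot n]\|_{L^2(\partial T'\cap\Omega)}^2 \big)$
    $\qquad\quad + (1+\delta^{-1})\,C_{\rm stab}\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}^2$.    (24)

The constant $C_{\rm stab} > 0$ depends only on the $\gamma$-shape regularity of
$\mathcal{T}_{\ell+1}$ and on $p \in \mathbb{N}$. Next, the sum is split into two sums over
$T' \in \mathcal{T}_\ell \cap \mathcal{T}_{\ell+1}$ and
$T' \in \mathcal{T}_{\ell+1} \setminus \mathcal{T}_\ell$. We use the reduction of the element
size $|T'| \le |T|/2$ for $T' \in \mathcal{T}_{\ell+1}$ being a son of a refined element
$T \in \mathcal{T}_\ell \setminus \mathcal{T}_{\ell+1}$. Since
$\mathcal{M}_\ell \subseteq \mathcal{T}_\ell \setminus \mathcal{T}_{\ell+1}$, one ends up with

    $\eta_{\ell+1}^2 \le (1+\delta) \Big( 2^{-1/d} \sum_{T \in \mathcal{T}_\ell\setminus\mathcal{T}_{\ell+1}} \eta_\ell(T)^2 + \sum_{T \in \mathcal{T}_\ell\cap\mathcal{T}_{\ell+1}} \eta_\ell(T)^2 \Big) + (1+\delta^{-1})\,C_{\rm stab}\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}^2$

    $\qquad \le (1+\delta) \Big( 2^{-1/d} \sum_{T \in \mathcal{M}_\ell} \eta_\ell(T)^2 + \sum_{T \in \mathcal{T}_\ell\setminus\mathcal{M}_\ell} \eta_\ell(T)^2 \Big) + (1+\delta^{-1})\,C_{\rm stab}\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}^2$

    $\qquad = (1+\delta) \Big( (2^{-1/d} - 1) \sum_{T \in \mathcal{M}_\ell} \eta_\ell(T)^2 + \eta_\ell^2 \Big) + (1+\delta^{-1})\,C_{\rm stab}\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}^2$.

Finally, Dörfler marking (16) proves (23) with

    $q_{\rm est} = \big(1 - \theta\,(1 - 2^{-1/d})\big)(1+\delta) \in (0,1)$ and $C_{\rm est} = (1+\delta^{-1})\,C_{\rm stab}$

for $\delta > 0$ sufficiently small. □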
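
To make the phrase "sufficiently small" concrete, the following Python sketch evaluates the reduction factor $(1 - \theta(1 - 2^{-1/d}))(1+\delta)$ from the end of the proof and the largest admissible range of $\delta$. The function names are hypothetical, and the formula is used as reconstructed from the proof of Lemma 2.

```python
def q_est(theta: float, d: int, delta: float) -> float:
    """Reduction factor (1 - theta * (1 - 2**(-1/d))) * (1 + delta)."""
    return (1.0 - theta * (1.0 - 2.0 ** (-1.0 / d))) * (1.0 + delta)


def delta_supremum(theta: float, d: int) -> float:
    """Supremum of all delta > 0 for which q_est(theta, d, delta) < 1."""
    base = 1.0 - theta * (1.0 - 2.0 ** (-1.0 / d))
    # q_est < 1  <=>  1 + delta < 1 / base  <=>  delta < 1 / base - 1
    return 1.0 / base - 1.0
```

Since the base factor lies strictly between 0 and 1 for $0 < \theta \le 1$, the supremum is positive, so an admissible $\delta$ always exists; smaller $\delta$ improves the factor but increases $(1+\delta^{-1})C_{\rm stab}$.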
Adaptive algorithms of the type of Algorithm 1 with nested ansatz spaces
$\mathcal{S}^p_0(\mathcal{T}_\ell) \subseteq \mathcal{S}^p_0(\mathcal{T}_{\ell+1})$ have in common
that there holds a priori convergence. This has already been observed in the early work [10]
and has later also been used in [22] to prove a general plain convergence result for AFEM.

Lemma 3. The sequence of Galerkin approximations $U_\ell$ of Algorithm 1 is convergent in
$H^1_0(\Omega)$, i.e. there exists $u_\infty \in \mathcal{S}^p_0(\mathcal{T}_\infty)$ with

    $U_\ell \to u_\infty$ as $\ell \to \infty$.    (25)

Proof. The space $\mathcal{S}^p_0(\mathcal{T}_\infty)$ is a closed subspace of $H^1_0(\Omega)$ and
therefore the Lax-Milgram lemma guarantees existence and uniqueness of a solution
$u_\infty \in \mathcal{S}^p_0(\mathcal{T}_\infty)$ of (12) with test space
$\mathcal{S}^p_0(\mathcal{T}_\infty)$ instead of $\mathcal{S}^p_0(\mathcal{T}_\ell)$. The Galerkin
approximations $U_\ell$ are also Galerkin approximations of $u_\infty$, since
$\mathcal{S}^p_0(\mathcal{T}_\ell) \subseteq \mathcal{S}^p_0(\mathcal{T}_\infty)$ for all
$\ell \in \mathbb{N}$. Therefore, the Céa lemma shows

    $\|\nabla(u_\infty - U_\ell)\|_{L^2(\Omega)} \lesssim \min_{V_\ell \in \mathcal{S}^p_0(\mathcal{T}_\ell)} \|\nabla(u_\infty - V_\ell)\|_{L^2(\Omega)} \to 0$

as $\ell \to \infty$. □

The combination of estimator reduction (23) and a priori convergence (25) yields
convergence of Algorithm 1.

Proposition 4. Algorithm 1 is convergent in $H^1_0(\Omega)$, i.e.

    $U_\ell \to u \in H^1_0(\Omega)$ as $\ell \to \infty$.    (26)

In particular, this implies $u = u_\infty \in \mathcal{S}^p_0(\mathcal{T}_\infty)$.

Proof. According to Lemma 3, the estimator reduction (23) of Lemma 2 takes the form

    $\eta_{\ell+1}^2 \le q_{\rm est}\,\eta_\ell^2 + \alpha_\ell$

with $\alpha_\ell \ge 0$ and $\lim_{\ell\to\infty} \alpha_\ell = 0$. From this, elementary calculus
proves $\lim_{\ell\to\infty} \eta_\ell = 0$, see e.g. [2]. Finally, reliability (14) of
$\eta_\ell$ concludes the proof. □
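
The elementary calculus step can be spelled out as follows. This is one possible argument, given as a sketch under the stated assumptions on the perturbed contraction; it is not claimed to be the argument of [2].

```latex
% Assumptions: 0 < q < 1, \alpha_\ell \ge 0, \alpha_\ell \to 0, and
% \eta_{\ell+1}^2 \le q\,\eta_\ell^2 + \alpha_\ell for all \ell.
\begin{align*}
% Step 1: the sequence is bounded (induction on \ell).
\sup_{\ell}\eta_\ell^2
  &\le \max\Big\{\eta_0^2,\ \frac{\sup_k \alpha_k}{1-q}\Big\} < \infty, \\
% Step 2: pass to the limit superior, which is finite by Step 1.
M := \limsup_{\ell\to\infty}\eta_{\ell+1}^2
  &\le q\,\limsup_{\ell\to\infty}\eta_\ell^2 + \lim_{\ell\to\infty}\alpha_\ell
   = q\,M
  \quad\Longrightarrow\quad M = 0 .
\end{align*}
```

Since $0 < q < 1$ and $0 \le M < \infty$, the inequality $M \le qM$ is only possible for $M = 0$, which gives $\eta_\ell \to 0$.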



3.2. Quasi-Galerkin orthogonality. The standard proof of the Pythagoras theorem
$|||u - U_{\ell+1}|||^2 + |||U_{\ell+1} - U_\ell|||^2 = |||u - U_\ell|||^2$ relies on Galerkin
orthogonality and symmetry of $b(\cdot,\cdot)$. The following lemmata provide a workaround for
our case of a non-symmetric bilinear form $b(\cdot,\cdot)$. We stress that the
quasi-orthogonality proof makes explicit use of the fact that we already have convergence
$U_\ell \to u$ in $H^1_0(\Omega)$ and $u \in \mathcal{S}^p_0(\mathcal{T}_\infty)$.

Lemma 5. The operators $\mathcal{A}, \mathcal{K} : H^1_0(\Omega) \to H^{-1}(\Omega)$ are bounded.
Moreover, $\mathcal{A}$ is symmetric and $\mathcal{K}$ is compact.

Proof. The symmetry of $\mathcal{A}$ is obvious, and both operators $\mathcal{A}$ and
$\mathcal{K}$ are also bounded, i.e.

    $\|\mathcal{A}v\|_{H^{-1}(\Omega)} \le \|A\|_{L^\infty(\Omega)}\,\|\nabla v\|_{L^2(\Omega)}$,
    $\|\mathcal{K}v\|_{H^{-1}(\Omega)} \lesssim \|\mathcal{K}v\|_{L^2(\Omega)} \lesssim \big(\|b\|_{L^{d/(d+2)}(\Omega)} + \|c\|_{L^{d/2}(\Omega)}\big)\,\|\nabla v\|_{L^2(\Omega)}$,

for all $v \in H^1_0(\Omega)$. It remains to prove that $\mathcal{K}$ is compact. The Rellich
compactness theorem shows that the embedding $\iota : H^1_0(\Omega) \to L^2(\Omega)$ is a compact
operator. Therefore, according to Schauder's theorem, see e.g. [30, Theorem 4.19], the adjoint
operator $\iota^* : L^2(\Omega) \to H^{-1}(\Omega)$ is also compact. Obviously,
$\iota^* : L^2(\Omega) \to H^{-1}(\Omega)$ coincides with the natural embedding, and we may write

    $\mathcal{K} = \iota^* \circ \mathcal{K} : H^1_0(\Omega) \to L^2(\Omega) \to H^{-1}(\Omega)$.

Therefore, $\mathcal{K}$ is the composition of a bounded operator and a compact operator and
hence compact. This concludes the proof. □

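Lemma 6. The sequences $(e_\ell)_{\ell\in\mathbb{N}}$ and $(E_\ell)_{\ell\in\mathbb{N}}$ defined by

    $e_\ell := \frac{u - U_\ell}{\|\nabla(u - U_\ell)\|_{L^2(\Omega)}}$ for $u \ne U_\ell$, and $e_\ell := 0$ else,
    $E_\ell := \frac{U_{\ell+1} - U_\ell}{\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}}$ for $U_{\ell+1} \ne U_\ell$, and $E_\ell := 0$ else,

converge to zero, weakly in $H^1_0(\Omega)$.

Proof. We prove weak convergence of $(e_\ell)$ to zero. The weak convergence of $(E_\ell)$
follows with the same arguments. Let $(e_{\ell_j})$ be a subsequence of $(e_\ell)$. Due to
boundedness $\|\nabla e_{\ell_j}\|_{L^2(\Omega)} \le 1$ for all $j \in \mathbb{N}$, we may extract a
weakly convergent subsequence $(e_{\ell_{j_k}})$ of $(e_{\ell_j})$ with

    $e_{\ell_{j_k}} \rightharpoonup w \in H^1_0(\Omega)$ as $k \to \infty$.

First, note that $u, U_\ell \in \mathcal{S}^p_0(\mathcal{T}_\infty)$ implies
$e_\ell \in \mathcal{S}^p_0(\mathcal{T}_\infty)$ and hence $w \in \mathcal{S}^p_0(\mathcal{T}_\infty)$.
Second, for all $\ell_{j_k} \ge \ell$ with $e_{\ell_{j_k}} \ne 0$ and all
$V_\ell \in \mathcal{S}^p_0(\mathcal{T}_\ell)$, it holds

    $b(e_{\ell_{j_k}}, V_\ell) = \|\nabla(u - U_{\ell_{j_k}})\|_{L^2(\Omega)}^{-1}\, b(u - U_{\ell_{j_k}}, V_\ell) = 0$.

For any $\ell \in \mathbb{N}$, $V_\ell \in \mathcal{S}^p_0(\mathcal{T}_\ell)$, and $\varepsilon > 0$,
there exists $k_0 \in \mathbb{N}$ such that for all $k \ge k_0$, it holds

    $|b(w, V_\ell)| = |\langle w\,,\,\mathcal{L}^* V_\ell\rangle| \le \varepsilon + |\langle e_{\ell_{j_k}}\,,\,\mathcal{L}^* V_\ell\rangle| = \varepsilon + |b(e_{\ell_{j_k}}, V_\ell)| = \varepsilon$,

since $k_0$ is chosen large enough such that $\ell_{j_k} \ge \ell$. Therefore

    $b(w, V_\ell) = 0$ for all $V_\ell \in \mathcal{S}^p_0(\mathcal{T}_\ell)$ and $\ell \in \mathbb{N}$.

Due to definiteness of $b(\cdot,\cdot)$ and
$w \in \mathcal{S}^p_0(\mathcal{T}_\infty) := \overline{\bigcup_{\ell\in\mathbb{N}} \mathcal{S}^p_0(\mathcal{T}_\ell)}$,
this implies $w = 0$. Altogether, we have now shown that each subsequence of $(e_\ell)$ has a
subsequence which converges weakly to zero. This immediately implies weak convergence
$e_\ell \rightharpoonup 0$ as $\ell \to \infty$. □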

The previous lemma shows that although $(E_\ell)_{\ell\in\mathbb{N}}$ is no orthonormal sequence,
it shares the property of weak convergence to zero with orthonormal systems. Note that our
proof already used convergence $U_\ell \to u$ as $\ell \to \infty$ in the sense that we required
$u - U_\ell \in \mathcal{S}^p_0(\mathcal{T}_\infty)$. This suffices to prove the following
quasi-Pythagoras theorem.

Proposition 7. For any $0 < \varepsilon < 1$, there exists $\ell_0 \in \mathbb{N}$ such that

    $|||U_{\ell+1} - U_\ell|||^2 \le \frac{1}{1-\varepsilon}\,|||u - U_\ell|||^2 - |||u - U_{\ell+1}|||^2$    (27)

for all $\ell \ge \ell_0$.

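Proof. Lemma 6 shows that $e_\ell, E_\ell \rightharpoonup 0$ as $\ell \to \infty$. Due to Lemma 5,
$\mathcal{K}$ is compact. Therefore, we have strong convergence
$\mathcal{K}e_\ell, \mathcal{K}E_\ell \to 0$ in $H^{-1}(\Omega)$ as $\ell \to \infty$. This shows

    $\langle \mathcal{K}(u - U_{\ell+1})\,,\,U_{\ell+1} - U_\ell\rangle = \langle \mathcal{K}e_{\ell+1}\,,\,U_{\ell+1} - U_\ell\rangle\,\|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}$
    $\qquad \le \|\mathcal{K}e_{\ell+1}\|_{H^{-1}(\Omega)}\,\|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}$

as well as

    $\langle \mathcal{K}(U_{\ell+1} - U_\ell)\,,\,u - U_{\ell+1}\rangle = \langle \mathcal{K}E_\ell\,,\,u - U_{\ell+1}\rangle\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}$
    $\qquad \le \|\mathcal{K}E_\ell\|_{H^{-1}(\Omega)}\,\|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}$.

For any $\delta > 0$, this may be employed to obtain some $\ell_0 \in \mathbb{N}$ such that for all
$\ell \ge \ell_0$, it holds

    $|\langle \mathcal{K}(U_{\ell+1} - U_\ell)\,,\,u - U_{\ell+1}\rangle| + |\langle \mathcal{K}(u - U_{\ell+1})\,,\,U_{\ell+1} - U_\ell\rangle|$
    $\qquad \le \delta\,\|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}$.

Together with Galerkin orthogonality

    $0 = b(u - U_{\ell+1}, V_{\ell+1}) = \langle \mathcal{L}(u - U_{\ell+1})\,,\,V_{\ell+1}\rangle$ for all $V_{\ell+1} \in \mathcal{S}^p_0(\mathcal{T}_{\ell+1})$,    (28)

we estimate

    $|\langle \mathcal{L}(U_{\ell+1} - U_\ell)\,,\,u - U_{\ell+1}\rangle| = |\langle \mathcal{A}(u - U_{\ell+1})\,,\,U_{\ell+1} - U_\ell\rangle + \langle \mathcal{K}(U_{\ell+1} - U_\ell)\,,\,u - U_{\ell+1}\rangle|$
    $\qquad \le |\langle \mathcal{L}(u - U_{\ell+1})\,,\,U_{\ell+1} - U_\ell\rangle| + |\langle \mathcal{K}(U_{\ell+1} - U_\ell)\,,\,u - U_{\ell+1}\rangle| + |\langle \mathcal{K}(u - U_{\ell+1})\,,\,U_{\ell+1} - U_\ell\rangle|$
    $\qquad \le \delta\,\|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}$.    (29)

The definition of $|||\cdot|||$ and Galerkin orthogonality (28) yield

    $|||u - U_{\ell+1}|||^2 + |||U_{\ell+1} - U_\ell|||^2 + \langle \mathcal{L}(U_{\ell+1} - U_\ell)\,,\,u - U_{\ell+1}\rangle = |||u - U_\ell|||^2$,

whence

    $|||U_{\ell+1} - U_\ell|||^2 \le |||u - U_\ell|||^2 - |||u - U_{\ell+1}|||^2 + \delta\,C_{\rm norm}^2\,|||u - U_{\ell+1}|||\,|||U_{\ell+1} - U_\ell|||$.

The application of Young's inequality $2ab \le a^2 + b^2$ and the choice
$\varepsilon = \delta\,C_{\rm norm}^2/2$ conclude the proof. □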
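
The final step of the proof can be written out as follows. This is a worked sketch; the abbreviations $a$, $c$, $e$ are introduced here for readability only, and the value of $\varepsilon$ is an assumption obtained from the constant $\delta C_{\rm norm}^2$ in the last displayed estimate of the proof.

```latex
% Abbreviations (introduced for this sketch only):
%   a := |||u - U_{\ell+1}|||,  c := |||U_{\ell+1} - U_\ell|||,  e := |||u - U_\ell|||,
%   \varepsilon := \delta\,C_{\rm norm}^2 / 2, with \delta so small that \varepsilon < 1.
\begin{align*}
c^2 &\le e^2 - a^2 + 2\varepsilon\,a\,c
     \le e^2 - a^2 + \varepsilon\,(a^2 + c^2)
     && \text{(Young: } 2ac \le a^2 + c^2\text{)} \\
\Longrightarrow\quad (1-\varepsilon)\,c^2 &\le e^2 - (1-\varepsilon)\,a^2 \\
\Longrightarrow\quad c^2 &\le \frac{1}{1-\varepsilon}\,e^2 - a^2 .
\end{align*}
```

The last line is the quasi-Pythagoras estimate in the form needed for the telescoping argument in Section 4.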

4. Contraction 

The quasi-Pythagoras theorem (27) from Proposition 7 allows us to prove R-linear convergence
of the error estimator $\eta_\ell$. Compared with the analysis of the symmetric case [11],
this is a weaker result. However, R-linear convergence is still sufficient to prove
quasi-optimal convergence rates in Section 5.

Theorem 8. There exist constants $0 < q_{\rm conv} < 1$ and $C_{\rm conv} > 0$ such that for all
$\ell, k \in \mathbb{N}$, there holds

    $\eta_{\ell+k}^2 \le C_{\rm conv}\,q_{\rm conv}^k\,\eta_\ell^2$.    (30)

The constants $q_{\rm conv}$ and $C_{\rm conv}$ depend only on $q_{\rm est}$, $C_{\rm est}$,
$C_{\rm norm}$, and $C_{\rm rel}$.

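Proof. We employ the estimator reduction (23) and reliability (14) to obtain for
$N \ge \ell+1$ and $\alpha < 1 - q_{\rm est}$

    $\sum_{k=\ell+1}^{N} \eta_k^2 \le \sum_{k=\ell+1}^{N} \big( q_{\rm est}\,\eta_{k-1}^2 + C_{\rm est}\,\|\nabla(U_k - U_{k-1})\|_{L^2(\Omega)}^2 \big)$
    $\qquad \le \sum_{k=\ell+1}^{N} \Big( (q_{\rm est} + \alpha)\,\eta_{k-1}^2 + C_{\rm est}\big( \|\nabla(U_k - U_{k-1})\|_{L^2(\Omega)}^2 - \alpha\,C_{\rm rel}^{-2}\,C_{\rm est}^{-1}\,\|\nabla(u - U_{k-1})\|_{L^2(\Omega)}^2 \big) \Big)$.

Rearranging the terms in the above estimate, we end up with

    $(1 - q_{\rm est} - \alpha) \sum_{k=\ell+1}^{N} \eta_k^2 \le (1 + q_{\rm est} + \alpha)\,\eta_\ell^2 + C_{\rm est}\,C_{\rm norm}^2 \sum_{k=\ell+1}^{N} \big( |||U_k - U_{k-1}|||^2 - \delta\,|||u - U_{k-1}|||^2 \big)$,

where $\delta = \alpha\,C_{\rm rel}^{-2}\,C_{\rm est}^{-1}\,C_{\rm norm}^{-4}$. Next, we aim at proving
that the sum on the right-hand side is bounded above by $\eta_\ell^2$, up to a multiplicative
constant, for all $N$. To that end, we employ Proposition 7 with $\varepsilon > 0$ such
that $1/(1-\varepsilon) \le 1 + \delta$. This gives a number $\ell_0 \in \mathbb{N}$ such that for
all $N > \ell \ge \ell_0$ we may estimate

    $\sum_{k=\ell+1}^{N} \big( |||U_k - U_{k-1}|||^2 - \delta\,|||u - U_{k-1}|||^2 \big) \le \sum_{k=\ell+1}^{N} \Big( \big(\tfrac{1}{1-\varepsilon} - \delta\big)\,|||u - U_{k-1}|||^2 - |||u - U_k|||^2 \Big)$
    $\qquad \le \sum_{k=\ell+1}^{N} \big( |||u - U_{k-1}|||^2 - |||u - U_k|||^2 \big)$    (31)
    $\qquad \le |||u - U_\ell|||^2 \le C_{\rm norm}^2\,C_{\rm rel}^2\,\eta_\ell^2$.

For all $\ell < \ell_0$, we first observe that $|||u - U_\ell||| = 0$ implies
$|||U_k - U_{k-1}||| = 0$ for all $k \ge \ell+1$, since $U_k = u = U_{k-1}$. Therefore, we obtain
with the convention $\infty \cdot 0 = 0$

    $C_{\rm sup} := \sup_{\ell \in \{1,\ldots,\ell_0\}} \Big( |||u - U_\ell|||^{-2} \sum_{k=\ell+1}^{\ell_0} |||U_k - U_{k-1}|||^2 \Big) < \infty$.

In combination with (31), we thus see

    $\sum_{k=\ell+1}^{N} \big( |||U_k - U_{k-1}|||^2 - \delta\,|||u - U_{k-1}|||^2 \big) \le (1 + C_{\rm sup})\,C_{\rm norm}^2\,C_{\rm rel}^2\,\eta_\ell^2$ for all $\ell \in \mathbb{N}$, $N \ge \ell$.

Plugging everything together, we have so far shown

    $\sum_{k=\ell+1}^{\infty} \eta_k^2 \le C\,\eta_\ell^2$ for all $\ell \in \mathbb{N}$,    (32)

for some constant $C > 0$ which depends only on $q_{\rm est}$, $C_{\rm est}$, $C_{\rm norm}$, and
$C_{\rm rel}$. Therefore, we get

    $(1 + C^{-1}) \sum_{k=\ell+1}^{\infty} \eta_k^2 \le \sum_{k=\ell+1}^{\infty} \eta_k^2 + \eta_\ell^2 = \sum_{k=\ell}^{\infty} \eta_k^2$

and hence by induction

    $\eta_{\ell+j}^2 \le \sum_{k=\ell+j}^{\infty} \eta_k^2 \le (1 + C^{-1})^{-j} \sum_{k=\ell}^{\infty} \eta_k^2 \le (1 + C)\,(1 + C^{-1})^{-j}\,\eta_\ell^2$ for all $\ell, j \in \mathbb{N}$.

This concludes the proof with $q_{\rm conv} = 1/(1 + C^{-1})$ and $C_{\rm conv} = (1 + C)$. □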
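
The last step of the proof, from the tail-sum bound (32) to geometric decay, can be checked numerically on finite sequences. The following Python sketch is an illustration only; the function names are hypothetical, and the input is a finite list of squared estimator values rather than output of an actual AFEM computation.

```python
from typing import Sequence, Tuple


def tail_constant(eta_sq: Sequence[float]) -> float:
    """Smallest C with sum(eta_sq[l+1:]) <= C * eta_sq[l] for every index l."""
    constant, tail = 0.0, 0.0
    for value in reversed(eta_sq):
        if tail > 0.0:
            if value <= 0.0:
                return float("inf")  # no finite constant exists
            constant = max(constant, tail / value)
        tail += value
    return constant


def r_linear_constants(constant: float) -> Tuple[float, float]:
    """Return (q_conv, C_conv) = (1 / (1 + 1/C), 1 + C) as in the proof."""
    if not 0.0 < constant < float("inf"):
        raise ValueError("C must be positive and finite")
    return 1.0 / (1.0 + 1.0 / constant), 1.0 + constant


def satisfies_r_linear(eta_sq: Sequence[float], q_conv: float, c_conv: float,
                       rtol: float = 1e-12) -> bool:
    """Check eta_sq[l+j] <= C_conv * q_conv**j * eta_sq[l] for all l, j."""
    n = len(eta_sq)
    return all(
        eta_sq[l + j] <= (1.0 + rtol) * c_conv * q_conv ** j * eta_sq[l]
        for l in range(n)
        for j in range(n - l)
    )
```

Note that the sequence need not be monotone: for `[1.0, 4.0, 0.5, 0.1]` the estimator increases in the first step, yet the bound (30) still holds with the constants computed from the tail sums.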

Remark. Note that the R-linear convergence of Theorem 8 holds for arbitrary adaptivity
parameters $0 < \theta \le 1$. Moreover, the result is independent of NVB in the sense that the
proof only requires that $|T'| \le q\,|T|$ for some $0 < q < 1$ and all sons
$T' \in \mathcal{T}_{\ell+1}$ of refined elements
$T \in \mathcal{T}_\ell \setminus \mathcal{T}_{\ell+1}$. This property holds for each feasible
mesh-refinement strategy and for NVB with $q = 2^{-1/d}$. Finally, the minimal cardinality of
the set $\mathcal{M}_\ell$ of marked elements has not been used, yet. Instead, Theorem 8 holds
as long as the set $\mathcal{M}_\ell \subseteq \mathcal{T}_\ell$ satisfies the Dörfler
marking (16) and, in particular, for $\mathcal{M}_\ell = \mathcal{T}_\ell$. □

Remark. Note that the proof of Theorem 8 uses neither linearity nor uniform
ellipticity of $\mathcal{L}$. Instead, we only require reliability (14), estimator
reduction (23), quasi-Galerkin orthogonality (27) as well as equivalence (11) of the norm
$\|\nabla(\cdot)\|_{L^2(\Omega)}$ and the energy quasi-norm $|||\cdot|||$ on $H^1_0(\Omega)$. With
these ingredients, our analysis is thus also capable of covering certain nonlinear problems
as discussed in Section 6.5. □




5. Optimal Convergence Rates 



With Theorem 8 at hand, we are in the position to prove quasi-optimal convergence
rates for the sequence of Galerkin solutions obtained from Algorithm 1. First, however,
we have to clarify what is the best possible convergence rate that can be aimed at. To
that end, we follow e.g. [11] and define the approximation class by

    $(u,f) \in \mathbb{A}_s \iff \|(u,f)\|_{\mathbb{A}_s} := \sup_{N \in \mathbb{N}} N^s\,\sigma(N;u,f) < \infty$    (33)

for all $s > 0$, where

    $\sigma(N;u,f) := \inf_{\mathcal{T}_\star \in \mathbb{T}_N}\ \inf_{V_\star \in \mathcal{S}^p_0(\mathcal{T}_\star)} \big( \|\nabla(u - V_\star)\|_{L^2(\Omega)} + {\rm osc}_\star \big)$

and ${\rm osc}_\star$ is the oscillation term from (15) corresponding to the mesh
$\mathcal{T}_\star$. We refer to [7, 16] for a characterization of approximation classes in
terms of Besov regularity. However, in this work, we follow [1] and use an equivalent
definition of $\mathbb{A}_s$, which involves the error estimator $\eta_\ell$ only. Due to
reliability (14), efficiency (15), ${\rm osc}_\star \le \eta_\star$, and the Céa lemma, the total
error is equivalent to the error estimator, i.e.

    $\|\nabla(u - U_\star)\|_{L^2(\Omega)} + {\rm osc}_\star \simeq \eta_\star$.

Hence, $\mathbb{A}_s$ from (33) can equivalently be characterized as

    $(u,f) \in \mathbb{A}_s \iff \|(u,f)\|_{\mathbb{A}_s} := \sup_{N \in \mathbb{N}}\ \inf_{\mathcal{T}_\star \in \mathbb{T}_N} N^s\,\eta_\star < \infty$    (34)

for all $s > 0$. In our opinion, this characterization allows for a clearer presentation of
the proof of the following quasi-optimality theorem and, in particular, we shall see that
unlike the analysis of [11, 12, 18, 27], the upper bound $0 < \theta_\star < 1$ for optimal
adaptivity parameters does not depend on the efficiency constant $C_{\rm eff}$. The following
result is the main theorem of this section.

Theorem 9. Define $\theta_\star := (1 + C_{\rm stab}\,C_{\rm dRel})^{-1}$ with the constants
$C_{\rm dRel} > 0$ from Lemma 10 and $C_{\rm stab} > 0$ from the proof of Lemma 2. Then, for all
adaptivity parameters $0 < \theta < \theta_\star$, and all $s > 0$, there exists a constant
$C_{\rm opt} > 0$ such that

    $(u,f) \in \mathbb{A}_s \iff \eta_\ell \le C_{\rm opt}\,\|(u,f)\|_{\mathbb{A}_s}\,(\#\mathcal{T}_\ell - \#\mathcal{T}_0)^{-s}$ for all $\ell \in \mathbb{N}$.    (35)

The constant $C_{\rm opt}$ depends only on $\theta$, $s$, $q_{\rm conv}$, $C_{\rm conv}$, and
$C_{\rm mesh}$, and the proof relies on the properties (18)-(21) of NVB.

For the proof of the quasi-optimality theorem, we need a refined reliability property of
the error estimator $\eta_\ell$.

Lemma 10 (discrete reliability). There exists a constant $C_{\rm dRel} > 0$ such that for all
refinements $\mathcal{T}_\star \in \mathbb{T}$ of a triangulation $\mathcal{T}_\ell \in \mathbb{T}$,
it holds

    $\|\nabla(U_\star - U_\ell)\|_{L^2(\Omega)}^2 \le C_{\rm dRel} \sum_{T \in \mathcal{T}_\ell\setminus\mathcal{T}_\star} \eta_\ell(T)^2$.    (36)

The constant $C_{\rm dRel}$ depends only on the $\gamma$-shape regularity of $\mathcal{T}_0$, the
polynomial degree $p \in \mathbb{N}$, and on $\Omega$.

Proof. The statement is proven for $b = 0$ and $c \ge 0$ in [11, Lemma 3.6]. The proof for
the present case follows verbatim. □

So far, we have observed that Dörfler marking (16) implies contraction of $\eta_\ell$
(Theorem 8). Now, we prove, in some sense, the converse. We follow the concept of proof of [1]
and stress that unlike e.g. [11, 12, 18, 27] our proof does not use efficiency (15) of
$\eta_\ell$.
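Lemma 11 (Optimality of Dörfler marking). Let
$0 < \theta < \theta_\star := (1 + C_{\rm stab}\,C_{\rm dRel})^{-1}$. Then, there exists
$0 < q_D < 1$ such that for all refinements $\mathcal{T}_\star \in \mathbb{T}$ of a triangulation
$\mathcal{T}_\ell \in \mathbb{T}$ the following statement is true

    $\eta_\star^2 \le q_D\,\eta_\ell^2 \quad\Longrightarrow\quad \theta\,\eta_\ell^2 \le \sum_{T \in \mathcal{T}_\ell\setminus\mathcal{T}_\star} \eta_\ell(T)^2$.    (37)

Proof. Analogously to (24), we estimate for $\delta > 0$

    $\eta_\ell^2 = \sum_{T \in \mathcal{T}_\ell\setminus\mathcal{T}_\star} \eta_\ell(T)^2 + \sum_{T \in \mathcal{T}_\ell\cap\mathcal{T}_\star} \eta_\ell(T)^2$
    $\qquad \le \sum_{T \in \mathcal{T}_\ell\setminus\mathcal{T}_\star} \eta_\ell(T)^2 + (1+\delta^{-1}) \sum_{T \in \mathcal{T}_\ell\cap\mathcal{T}_\star} \eta_\star(T)^2 + (1+\delta)\,C_{\rm stab}\,\|\nabla(U_\star - U_\ell)\|_{L^2(\Omega)}^2$    (38)
    $\qquad \le \sum_{T \in \mathcal{T}_\ell\setminus\mathcal{T}_\star} \eta_\ell(T)^2 + (1+\delta^{-1})\,q_D\,\eta_\ell^2 + (1+\delta)\,C_{\rm stab}\,\|\nabla(U_\star - U_\ell)\|_{L^2(\Omega)}^2$.

Rearranging the terms and employing the discrete reliability (36), we end up with

    $\frac{1 - (1+\delta^{-1})\,q_D}{1 + (1+\delta)\,C_{\rm stab}\,C_{\rm dRel}}\ \eta_\ell^2 \le \sum_{T \in \mathcal{T}_\ell\setminus\mathcal{T}_\star} \eta_\ell(T)^2$.

According to $\theta < (1 + C_{\rm stab}\,C_{\rm dRel})^{-1}$, we may finally choose $\delta > 0$
and $0 < q_D < 1$ sufficiently small to ensure

    $\theta \le \frac{1 - (1+\delta^{-1})\,q_D}{1 + (1+\delta)\,C_{\rm stab}\,C_{\rm dRel}} < \frac{1}{1 + C_{\rm stab}\,C_{\rm dRel}}$.

This concludes the proof. □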
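
The choice of $\delta$ and $q_D$ at the end of the proof can be made explicit. The following Python sketch assumes the threshold $\theta_\star = (1 + C_{\rm stab} C_{\rm dRel})^{-1}$ and the quotient from the proof of Lemma 11; the function names and the coupling $q_D = \delta^2$ are choices made for this illustration only. With this coupling, $(1+\delta^{-1})q_D = \delta^2 + \delta \to 0$, so the quotient tends to $\theta_\star$ as $\delta \to 0$.

```python
from typing import Tuple


def theta_star(c_stab: float, c_drel: float) -> float:
    """Upper bound (1 + C_stab * C_dRel)**(-1) for the marking parameter."""
    return 1.0 / (1.0 + c_stab * c_drel)


def doerfler_quotient(delta: float, q_d: float, c_stab: float, c_drel: float) -> float:
    """Quotient (1 - (1 + 1/delta) q_D) / (1 + (1 + delta) C_stab C_dRel)."""
    return (1.0 - (1.0 + 1.0 / delta) * q_d) / (1.0 + (1.0 + delta) * c_stab * c_drel)


def choose_delta_and_q(theta: float, c_stab: float, c_drel: float,
                       max_halvings: int = 60) -> Tuple[float, float]:
    """Halve delta (with q_D = delta**2) until the quotient is at least theta."""
    if not 0.0 < theta < theta_star(c_stab, c_drel):
        raise ValueError("theta must satisfy 0 < theta < theta_star")
    delta = 0.5
    for _ in range(max_halvings):
        q_d = delta * delta
        if doerfler_quotient(delta, q_d, c_stab, c_drel) >= theta:
            return delta, q_d
        delta *= 0.5
    raise RuntimeError("no admissible pair found; theta may be too close to theta_star")
```

For the illustrative values $C_{\rm stab} = C_{\rm dRel} = 1$, the threshold is $\theta_\star = 0.5$, and $\theta = 0.4$ is reached after two halvings.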

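Now, we are in the position to prove Theorem 9. We stress that the concept of proof
goes back to [27] and has been adopted by [11] and all succeeding works. We put emphasis
on the fact that, first, efficiency (15) of $\eta_\ell$ is not needed and that, second,
R-linear convergence (30) instead of plain contraction in each step of the adaptive loop is
sufficient.

Proof of Theorem 9. Let $\lambda > 0$ denote a free parameter, which is fixed later on. The
definition of the approximation class allows for given
$\varepsilon^2 := \lambda\,\eta_\ell^2 > 0$ to choose a mesh
$\mathcal{T}_\varepsilon \in \mathbb{T}$ such that

    $\eta_\varepsilon \le \varepsilon$ and $\#\mathcal{T}_\varepsilon - \#\mathcal{T}_0 \lesssim \|(u,f)\|_{\mathbb{A}_s}^{1/s}\,\varepsilon^{-1/s}$.

Now, consider the overlay $\mathcal{T}_\star := \mathcal{T}_\varepsilon \oplus \mathcal{T}_\ell$
and argue similarly to (24) to see

    $\eta_\star^2 \lesssim \eta_\varepsilon^2 + \|\nabla(U_\star - U_\varepsilon)\|_{L^2(\Omega)}^2 \lesssim \eta_\varepsilon^2 \le \lambda\,\eta_\ell^2$,

where we used the definition of $\varepsilon > 0$. We choose $\lambda > 0$ sufficiently small
such that Lemma 11 is applicable and conclude that
$\mathcal{T}_\ell \setminus \mathcal{T}_\star$ satisfies the Dörfler marking (16). By
definition of step (iii) of Algorithm 1, the set $\mathcal{M}_\ell$ of marked elements is a
set of minimal cardinality which satisfies the Dörfler marking. Therefore, we obtain by use
of (19) and (21)

    $\#\mathcal{M}_\ell \le \#(\mathcal{T}_\ell \setminus \mathcal{T}_\star) \le \#\mathcal{T}_\star - \#\mathcal{T}_\ell \le \#\mathcal{T}_\varepsilon - \#\mathcal{T}_0 \lesssim \|(u,f)\|_{\mathbb{A}_s}^{1/s}\,\varepsilon^{-1/s} \simeq \|(u,f)\|_{\mathbb{A}_s}^{1/s}\,\eta_\ell^{-1/s}$    (39)

for all $\ell \in \mathbb{N}$. Finally, the closure estimate (20) and the R-linear
convergence (30) of Theorem 8 yield

    $\#\mathcal{T}_\ell - \#\mathcal{T}_0 \lesssim \sum_{j=0}^{\ell-1} \#\mathcal{M}_j \lesssim \|(u,f)\|_{\mathbb{A}_s}^{1/s} \sum_{j=0}^{\ell-1} \eta_j^{-1/s} \lesssim \|(u,f)\|_{\mathbb{A}_s}^{1/s}\,\eta_\ell^{-1/s} \sum_{j=0}^{\ell-1} q_{\rm conv}^{(\ell-j)/(2s)}$.

Exploiting the convergence of the geometric series, we end up with

    $\eta_\ell \lesssim \|(u,f)\|_{\mathbb{A}_s}\,(\#\mathcal{T}_\ell - \#\mathcal{T}_0)^{-s}$ for all $\ell \in \mathbb{N}$.

Altogether, this proves that each theoretically possible convergence rate for the estimator
is, in fact, asymptotically achieved by the adaptive algorithm. The converse implication
in (35) is obvious. This concludes the proof. □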

Remark. We stress that the proof of Theorem 9 depends only on properties (18)-(21)
of NVB, R-linear convergence (30) of the estimator used and the discrete reliability (36).
In particular, there is no explicit use of the properties of the differential operator
$\mathcal{L}$, i.e. neither linearity nor uniform ellipticity is required. □

6. Extensions 

In this section, we want to discuss some possible extensions of our analysis. 

6.1. Minimal cardinality of marked elements. The choice of the set of marked
elements $\mathcal{M}_\ell$ in step (iii) of Algorithm 1 to be a set of minimal cardinality
which satisfies the Dörfler marking (16) requires sorting the set
$\{\eta_\ell(T) : T \in \mathcal{T}_\ell\}$, which takes at least
$O(\#\mathcal{T}_\ell \log(\#\mathcal{T}_\ell))$ operations. In comparison to
$O(\#\mathcal{T}_\ell)$ operations for iterative solvers on sparse matrices, marking becomes
the bottleneck of Algorithm 1. To overcome this problem, we may allow the set
$\mathcal{M}_\ell$ to be of almost minimal cardinality in the sense of

    $\#\mathcal{M}_\ell \le C\,\#\widetilde{\mathcal{M}}_\ell$ for all $\ell \in \mathbb{N}$,    (40)

where $\widetilde{\mathcal{M}}_\ell$ is a set of minimal cardinality which satisfies Dörfler
marking and $C > 0$ is an arbitrary but fixed constant. All the proofs hold true up to the
additional factor $C$, which is involved in (39). The relaxation (40) allows us to apply an
inexact sorting algorithm based on binning of the data (see e.g. [20]) which performs in
$O(\#\mathcal{T}_\ell)$ operations.
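
A binning-based marking step could look as follows. This Python sketch is not the algorithm of [20]; the dyadic bins relative to the largest indicator, the fixed number of bins, and the function name are assumptions made for illustration. The returned set always satisfies the Dörfler criterion (16), and the work is linear in the number of elements; the size of the constant $C$ in (40) for this particular variant is not analyzed here.

```python
import math
from typing import List, Sequence


def doerfler_mark_binned(eta_sq: Sequence[float], theta: float,
                         n_bins: int = 32) -> List[int]:
    """Mark elements bin by bin, largest indicators first, until
    theta * sum(eta_sq) <= sum over the marked elements."""
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must lie in (0, 1]")
    total = sum(eta_sq)
    if total <= 0.0:
        return []
    top = max(eta_sq)
    bins: List[List[int]] = [[] for _ in range(n_bins)]
    for i, value in enumerate(eta_sq):
        # Bin k collects values in (top * 2**-(k+1), top * 2**-k];
        # everything smaller goes into the last bin.
        if value * 2.0 ** (n_bins - 1) <= top:
            k = n_bins - 1
        else:
            k = int(math.log2(top / value))
        bins[k].append(i)
    marked: List[int] = []
    accumulated = 0.0
    for bucket in bins:
        for i in bucket:
            if accumulated >= theta * total:
                return marked
            marked.append(i)
            accumulated += eta_sq[i]
    return marked
```

Within one bin the indicators differ by at most a factor of two, which is why no exact sorting is needed to stay close to the minimal set.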

6.2. Other mesh-refinement strategies. Instead of simple newest vertex bisection,
one can consider other mesh-refinement strategies which satisfy (19)-(21), since no other
property of the mesh-refinement strategy is used throughout this paper. In particular,
one could use up to $m$ newest vertex bisections per marked element, where $m \in \mathbb{N}$ is
a fixed number, cf. e.g. [18]. This includes the strategy proposed in [12] which uses
additional bisections every n-th step to ensure the interior node property and hence to
obtain a discrete lower bound on the error. Moreover, one can relax the regularity of the
triangulations used and allow a fixed number of hanging nodes in each triangle
$T \in \mathcal{T}_\ell$ [8].



6.3. Inhomogeneous Dirichlet data. Let $\mathcal{S}^p(\mathcal{T}_\ell) := \mathcal{P}^p(\mathcal{T}_\ell) \cap H^1(\Omega)$ with discrete
trace space $\mathcal{S}^p(\mathcal{T}_\ell|_\Gamma) := \{V_\ell|_\Gamma : V_\ell \in \mathcal{S}^p(\mathcal{T}_\ell)\}$. We consider inhomogeneous Dirichlet
data $g \in H^{1/2}(\Gamma)$ and an $H^{1/2}$-stable projection $P_\ell : H^{1/2}(\Gamma) \to \mathcal{S}^p(\mathcal{T}_\ell|_\Gamma)$, for instance
the Scott-Zhang projection [26] for $p \ge 1$ or the $L^2$-projection for $p = 1$ (see [17] for
$H^1$-stability on NVB refined meshes). The continuous problem we want to solve now
reads: Find $u \in H^1(\Omega)$ with $u|_{\partial\Omega} = g$ such that

$$\langle \mathcal{L}u\,,\,v\rangle = b(u,v) = \int_\Omega f v\,dx \quad\text{for all } v \in H^1_0(\Omega). \tag{41}$$

The corresponding discrete formulation reads: Find $U_\ell \in \mathcal{S}^p(\mathcal{T}_\ell)$ with $U_\ell|_\Gamma = P_\ell g$ such
that

$$b(U_\ell, V_\ell) = \int_\Omega f V_\ell\,dx \quad\text{for all } V_\ell \in \mathcal{S}^p_0(\mathcal{T}_\ell). \tag{42}$$

Well-posedness of (41)-(42) is well-known and discussed, e.g., in [1, 4, 24]. The approximation
error which is introduced via $g \approx P_\ell g$ results in an additional error quantity. We
assume regularity $g \in H^1(\Gamma)$ and define the Dirichlet data oscillations

$$\mathrm{osc}_{g,\ell}^2 := \operatorname{diam}(E)\,\|\nabla_\Gamma (1 - P_\ell) g\|_{L^2(E)}^2,$$

where $\nabla_\Gamma(\cdot)$ denotes the surface gradient on $\Gamma = \partial\Omega$.

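Since the ansatz spaces are no longer nested, i.e. $U_{\ell+1} - U_\ell \notin \mathcal{S}^p_0(\mathcal{T}_{\ell+1})$, we have to rely
on a modified marking strategy proposed in [27]. We replace the Dörfler marking (16)
by the following separate marking strategy with adaptivity parameters $0 < \theta, \vartheta < 1$:

• If $\mathrm{osc}_{g,\ell}^2 \le \vartheta\,\eta_\ell^2$, determine $\mathcal{M}_\ell \subseteq \mathcal{T}_\ell$ as a set of minimal cardinality which satisfies (16).

• If $\mathrm{osc}_{g,\ell}^2 > \vartheta\,\eta_\ell^2$, determine $\mathcal{M}_\ell \subseteq \mathcal{T}_\ell$ as a set of minimal cardinality which satisfies

$$\theta\,\mathrm{osc}_{g,\ell}^2 \le \sum_{T \in \mathcal{M}_\ell} \mathrm{osc}_{g,\ell}(T)^2. \tag{43}$$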
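
The case distinction of the separate marking strategy can be sketched in a few lines of Python. The sketch rests on assumptions that are not part of the text above: the helper names `doerfler_minimal` and `separate_marking` are hypothetical, the oscillations are assumed to be available elementwise as `osc_sq[T]`, and (16) is assumed to read $\theta\,\eta_\ell^2 \le \sum_{T \in \mathcal{M}_\ell} \eta_\ell(T)^2$. The minimal set is found by plain sorting for clarity.

```python
def doerfler_minimal(indicators_sq, theta):
    """Indices of a minimal-cardinality set M with
    theta * sum(indicators_sq) <= sum(indicators_sq[i] for i in M)."""
    total = sum(indicators_sq)
    # Visit the elements in order of decreasing indicator.
    order = sorted(range(len(indicators_sq)),
                   key=lambda i: indicators_sq[i], reverse=True)
    marked, acc = [], 0.0
    for i in order:
        if acc >= theta * total:
            break
        marked.append(i)
        acc += indicators_sq[i]
    return marked


def separate_marking(eta_sq, osc_sq, theta, vartheta):
    """Separate marking: mark by the estimator if the Dirichlet data
    oscillations are dominated by it, otherwise mark by the oscillations."""
    if sum(osc_sq) <= vartheta * sum(eta_sq):
        return doerfler_minimal(eta_sq, theta)   # first case: criterion (16)
    return doerfler_minimal(osc_sq, theta)       # second case: criterion (43)


# Usage: small oscillations lead to estimator-driven marking ...
print(separate_marking([4.0, 1.0, 1.0, 2.0], [0.0, 0.5, 0.0, 0.0], 0.5, 0.25))  # -> [0]
# ... whereas dominant oscillations lead to oscillation-driven marking.
print(separate_marking([4.0, 1.0, 1.0, 2.0], [0.0, 3.0, 0.0, 1.0], 0.5, 0.25))  # -> [1]
```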

Now, the analysis of [1] can easily be transferred to the present problem as well, where $\eta_\ell$
in (23), (30), and (34)-(35) is replaced by $\rho_\ell := \eta_\ell + \mathrm{osc}_{g,\ell}$. For usual choices of $P_\ell$ as
above, one obtains convergence of AFEM by means of the estimator reduction principle [1,
Theorem 4]. Moreover, for arbitrary $P_\ell$ and sufficiently small marking parameters
$0 < \theta, \vartheta < 1$, we obtain the optimality result of Theorem 9, cf. [1, Theorem 6].

For $d = 2$, one may even use nodal interpolation to discretize the inhomogeneous
Dirichlet data. Then, the combined Dörfler marking (16) for $\rho_\ell := \eta_\ell + \mathrm{osc}_{g,\ell}$ instead of
$\eta_\ell$ yields the contraction result of Theorem 8. Moreover, for sufficiently small $0 < \theta < 1$,
Theorem 9 remains valid. We refer to [15] in case of symmetric $\mathcal{L} = -\Delta$ and stress that
the analysis can easily be transferred to the present setting.

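6.4. Coercive but not uniformly elliptic bilinear forms. Assume that instead
of ellipticity (10), there holds a Gårding inequality

$$b(u,u) + C_{\rm gard}\,\|u\|_{L^2(\Omega)}^2 \ge \rho_{\rm gard}\,\|\nabla u\|_{L^2(\Omega)}^2 \quad\text{for all } u \in H^1_0(\Omega) \tag{44}$$

with constants $0 < \rho_{\rm gard} < 1$ and $C_{\rm gard} > 0$. We have to assume that $b(\cdot,\cdot)$ is definite on
the continuous level, i.e. for all $v \in H^1_0(\Omega)$ and all $v_\infty \in \mathcal{S}^p_0(\mathcal{T}_\infty)$, it holds

$$b(v,w) = 0 \ \text{ for all } w \in H^1_0(\Omega) \quad\Longrightarrow\quad v = 0, \tag{45a}$$
$$b(v_\infty,w_\infty) = 0 \ \text{ for all } w_\infty \in \mathcal{S}^p_0(\mathcal{T}_\infty) \quad\Longrightarrow\quad v_\infty = 0. \tag{45b}$$

This together with Fredholm's alternative already guarantees the unique solvability of (9)
and (12) with test and ansatz space $\mathcal{S}^p_0(\mathcal{T}_\infty)$ instead of $\mathcal{S}^p_0(\mathcal{T}_\ell)$.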
Remark. Usually, the conditions (45) are guaranteed under the assumption that the
mesh-size of the initial mesh $\mathcal{T}_0$ is sufficiently small and that the solution $w \in H^1_0(\Omega)$ of
the dual problem

$$b(v,w) = \int_\Omega f v\,dx \quad\text{for all } v \in H^1_0(\Omega)$$

satisfies some regularity estimate

$$\|w\|_{H^{1+s}(\Omega)} \lesssim \|f\|_{L^2(\Omega)} \quad\text{for some } s > 0,$$

see e.g. [9, Theorem 5.7.6]. □
Now, we may apply [25, Theorem 4.2.9] to obtain the following result. 

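Lemma 12. There exists an index $\ell_0 \in \mathbb{N}$ such that for all $\ell \ge \ell_0$ the discrete formulation
(12) is uniquely solvable, and it holds

$$\|\nabla(u_\infty - U_\ell)\|_{L^2(\Omega)} \le C_{\rm cea} \min_{V_\ell \in \mathcal{S}^p_0(\mathcal{T}_\ell)} \|\nabla(u_\infty - V_\ell)\|_{L^2(\Omega)}, \tag{46}$$

where $u_\infty \in \mathcal{S}^p_0(\mathcal{T}_\infty)$ denotes the unique solution of (12) with $\mathcal{S}^p_0(\mathcal{T}_\infty)$ instead of $\mathcal{S}^p_0(\mathcal{T}_\ell)$.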

Proof. Since (44) states that $b(u,v) + C_{\rm gard}\,(u,v)_{L^2(\Omega)}$ is elliptic and $(\cdot,\cdot)_{L^2(\Omega)}$ is a compact
perturbation, we apply [25, Theorem 4.2.9] on the Hilbert space $\mathcal{S}^p_0(\mathcal{T}_\infty)$ and the dense
sequence of subspaces $\mathcal{S}^p_0(\mathcal{T}_\ell)$ for $\ell \to \infty$. □

The above lemma allows one to prove a priori convergence from Lemma 3 and consequently
convergence $U_\ell \to u$ in $H^1_0(\Omega)$ as well as $u \in \mathcal{S}^p_0(\mathcal{T}_\infty)$. Moreover, Lemma 6 still holds
true, since we assumed definiteness of $b(\cdot,\cdot)$ on $\mathcal{S}^p_0(\mathcal{T}_\infty)$ in (45b).

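Lemma 13. There exists an index $\ell_1 \in \mathbb{N}$ such that for all $\ell \ge \ell_1$ there holds

$$\|\nabla(u - U_\ell)\|_{L^2(\Omega)} \le C_{\rm norm}\,|\!|\!|u - U_\ell|\!|\!| \quad\text{and}\quad \|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)} \le C_{\rm norm}\,|\!|\!|U_{\ell+1} - U_\ell|\!|\!|.$$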

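Proof. With (44) and $b(\cdot,\cdot) = |\!|\!|\cdot|\!|\!|^2$ we may estimate

$$\|\nabla(u - U_\ell)\|_{L^2(\Omega)}^2 \lesssim |\!|\!|u - U_\ell|\!|\!|^2 + \|u - U_\ell\|_{L^2(\Omega)}^2 = |\!|\!|u - U_\ell|\!|\!|^2 + \|e_\ell\|_{L^2(\Omega)}^2\,\|\nabla(u - U_\ell)\|_{L^2(\Omega)}^2.$$

Lemma 6 shows weak convergence $e_\ell \rightharpoonup 0$ in $H^1_0(\Omega)$. The Rellich compactness theorem
thus implies strong convergence $e_\ell \to 0$ in $L^2(\Omega)$. Therefore, there exists an index $\ell_1 \in \mathbb{N}$
such that there holds

$$\|\nabla(u - U_\ell)\|_{L^2(\Omega)}^2 \lesssim |\!|\!|u - U_\ell|\!|\!|^2 \quad\text{for all } \ell \ge \ell_1.$$

The statement for $U_{\ell+1} - U_\ell$ follows analogously. □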
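
The absorption step behind the choice of $\ell_1$ in the preceding proof can be spelled out as follows. This is only a sketch: it assumes that $e_\ell$ denotes the normalized error $(u - U_\ell)/\|\nabla(u - U_\ell)\|_{L^2(\Omega)}$, it uses the constants $\rho_{\rm gard}$ and $C_{\rm gard}$ of (44), and the threshold $1/2$ as well as the resulting explicit value of $C_{\rm norm}$ are choices made for illustration.

```latex
\begin{align*}
\rho_{\rm gard}\,\|\nabla(u-U_\ell)\|_{L^2(\Omega)}^2
 &\le |\!|\!|u-U_\ell|\!|\!|^2 + C_{\rm gard}\,\|u-U_\ell\|_{L^2(\Omega)}^2 \\
 &= |\!|\!|u-U_\ell|\!|\!|^2
    + C_{\rm gard}\,\|e_\ell\|_{L^2(\Omega)}^2\,\|\nabla(u-U_\ell)\|_{L^2(\Omega)}^2 .
\end{align*}
% Since e_\ell \to 0 strongly in L^2(\Omega), there is an index \ell_1 with
% C_{\rm gard}\,\|e_\ell\|_{L^2(\Omega)}^2 \le \rho_{\rm gard}/2 for all \ell \ge \ell_1.
% Absorbing the last term into the left-hand side then gives
\[
\tfrac{\rho_{\rm gard}}{2}\,\|\nabla(u-U_\ell)\|_{L^2(\Omega)}^2 \le |\!|\!|u-U_\ell|\!|\!|^2 ,
\qquad\text{i.e.}\qquad
\|\nabla(u-U_\ell)\|_{L^2(\Omega)} \le \sqrt{2/\rho_{\rm gard}}\;|\!|\!|u-U_\ell|\!|\!|
\quad\text{for all } \ell \ge \ell_1 .
\]
```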

Lemma 6 together with Lemma 13 allows one to prove the quasi-Galerkin orthogonality of
Proposition 7 and consequently also the $R$-linear convergence of Theorem 8. Therefore,
all the results from Section 5 hold and, in particular, we obtain the optimality result of
Theorem 9.

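6.5. Non-linear operators $\mathcal{L}$. We consider the following non-linear operator

$$\mathcal{L}u(x) := -\operatorname{div}\boldsymbol{A}(x,\nabla u(x)) + g(x,u(x),\nabla u(x)),$$

for functions $\boldsymbol{A} : \Omega \times \mathbb{R}^d \to \mathbb{R}^d$ and $g : \Omega \times \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}$. We assume that
$\boldsymbol{A}(\cdot,\nabla u),\, g(\cdot,u,\nabla u) \in L^2(\Omega)$ for all $u \in H^1_0(\Omega)$. Then, the weak formulation of (1)
reads: Find $u \in H^1_0(\Omega)$ such that

$$\langle \mathcal{L}u\,,\,v\rangle = \int_\Omega \boldsymbol{A}(x,\nabla u(x)) \cdot \nabla v(x) + g(x,u(x),\nabla u(x))\,v(x)\,dx = \int_\Omega f v\,dx \tag{47}$$

for all $v \in H^1_0(\Omega)$. We define two auxiliary operators $\mathcal{A}, \mathcal{K} : H^1_0(\Omega) \to H^{-1}(\Omega)$ as

$$\mathcal{A}v := -\operatorname{div}\boldsymbol{A}(\cdot,\nabla v) \quad\text{and}\quad \mathcal{K}v := g(\cdot,v,\nabla v) \quad\text{for all } v \in H^1_0(\Omega).$$

The solvability and uniqueness of (47) is part of the next section.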

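6.5.1. Regularity assumptions. We consider the framework of strongly monotone
operators and require the following regularity assumptions on $\mathcal{L}$:

$$\begin{aligned}
\|\boldsymbol{A}(\cdot,\nabla w) - \boldsymbol{A}(\cdot,\nabla v)\|_{L^2(\Omega)} &\le C_{\rm lip}\,\|\nabla(w - v)\|_{L^2(\Omega)},\\
\|g(\cdot,w,\nabla w) - g(\cdot,v,\nabla v)\|_{L^2(\Omega)} &\le C_{\rm lip}\,\|\nabla(w - v)\|_{L^2(\Omega)}
\end{aligned} \tag{48}$$

for all $w,v \in H^1_0(\Omega)$ and some constant $C_{\rm lip} > 0$ as well as

$$\langle \mathcal{L}w - \mathcal{L}v\,,\,w - v\rangle \ge C_{\rm mon}\,\|\nabla(w - v)\|_{L^2(\Omega)}^2 \tag{49}$$

for all $w,v \in H^1_0(\Omega)$ and some constant $C_{\rm mon} > 0$. These assumptions allow one to apply
the main theorem on strongly monotone operators [31, Theorem 26.A] and to obtain the
unique solvability of (47) as well as of (12). Additionally, (48)-(49) guarantee that the
norms of the residual and the error are equivalent, i.e.

$$\|\mathcal{L}u - \mathcal{L}U_\ell\|_{H^{-1}(\Omega)} \simeq \|\nabla(u - U_\ell)\|_{L^2(\Omega)} \quad\text{for all } \ell \in \mathbb{N}. \tag{50}$$

We also obtain the Céa lemma (13) with the constant $2C_{\rm lip}/C_{\rm mon}$.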

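Moreover, we require $\boldsymbol{A} : \Omega \times \mathbb{R}^d \to \mathbb{R}^d$ to be Lipschitz continuous and $\mathcal{L} : H^1_0(\Omega) \to
H^{-1}(\Omega)$ as well as $\mathcal{A} : H^1_0(\Omega) \to H^{-1}(\Omega)$ to be twice Fréchet differentiable, i.e. there
exist

$$\begin{aligned}
D\mathcal{L},\, D\mathcal{A} &: H^1_0(\Omega) \to L\big(H^1_0(\Omega), H^{-1}(\Omega)\big),\\
D^2\mathcal{L},\, D^2\mathcal{A} &: H^1_0(\Omega) \to L\big(H^1_0(\Omega), L(H^1_0(\Omega), H^{-1}(\Omega))\big).
\end{aligned} \tag{51}$$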

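The second derivative should be bounded locally around the solution $u$ of (47), i.e., there
exists $\varepsilon_{\rm loc} > 0$ with

$$C_{\rm loc} := \sup_{\substack{v \in H^1_0(\Omega)\\ \|\nabla(u - v)\|_{L^2(\Omega)} \le \varepsilon_{\rm loc}}} \Big( \|D^2\mathcal{L}(v)\|_{L(H^1_0(\Omega),L(H^1_0(\Omega),H^{-1}(\Omega)))} + \|D^2\mathcal{A}(v)\|_{L(H^1_0(\Omega),L(H^1_0(\Omega),H^{-1}(\Omega)))} \Big) < \infty. \tag{52}$$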

Finally, we assume that $D\mathcal{A}(v) : H^1_0(\Omega) \to H^{-1}(\Omega)$ is symmetric for all $v \in H^1_0(\Omega)$.

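Remark. Note that if $\boldsymbol{A} : \Omega \times \mathbb{R}^d \to \mathbb{R}^d$ and $g : \Omega \times \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}$ are twice differentiable,
and if $\nabla_y \boldsymbol{A}(x,y) \in \mathbb{R}^{d \times d}$ additionally is a symmetric matrix, then $\mathcal{L}$ and $\mathcal{A}$ satisfy (51)
as well as (52). Moreover, $D\mathcal{A}(v)$ is symmetric for all $v \in H^1_0(\Omega)$. □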

Example. We stress that the assumptions posed on $\boldsymbol{A}$ and $\mathcal{L}$ cover, for instance,
non-linear material laws in magnetostatics, where e.g. $\boldsymbol{A}(\cdot,\cdot)$ takes the form

A(..V«W)=(l + :^-^|J^)v,.W. 

We refer to e.g. [23] for further examples. □ 



6.5.2. Auxiliary results. This section provides some technical lemmata, which are 
used to transfer the results from the linear case to the present non-linear case. 
Lemma 14. The residual error estimator

$$\eta_\ell^2 := \sum_{T \in \mathcal{T}_\ell} \Big( |T|^{2/d}\,\|\mathcal{L}|_T U_\ell - f\|_{L^2(T)}^2 + |T|^{1/d}\,\|[\boldsymbol{A}(\cdot,\nabla U_\ell) \cdot n]\|_{L^2(\partial T \cap \Omega)}^2 \Big) \tag{53}$$

is well-defined and satisfies reliability (14), discrete reliability (36), and estimator reduction
(23). Moreover, there holds convergence

$$\|\nabla(u - U_\ell)\|_{L^2(\Omega)} \to 0 \quad\text{as } \ell \to \infty. \tag{54}$$

Proof. The Lipschitz continuity of $\boldsymbol{A} : \Omega \times \mathbb{R}^d \to \mathbb{R}^d$ implies $\operatorname{div}\boldsymbol{A}(\cdot,\cdot) \in L^\infty(\Omega)$. Therefore,
the residual error estimator $\eta_\ell$ is well-defined. With the equivalence (50), the standard
arguments apply to prove reliability (14), and the proof of discrete reliability (36)
follows analogously to [11]. Finally, with Lipschitz continuity of $\boldsymbol{A} : \Omega \times \mathbb{R}^d \to \mathbb{R}^d$
and (48), the proof of estimator reduction in Lemma 2 remains valid. Therefore, Proposition 4
holds true and proves convergence (54). □

Lemma 15. The operator $(D\mathcal{L})|_{\mathcal{S}^p_0(\mathcal{T}_\infty)}u : \mathcal{S}^p_0(\mathcal{T}_\infty) \to \mathcal{S}^p_0(\mathcal{T}_\infty)^*$ is injective.

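Proof. With (49) and the definition of the Fréchet derivative, there holds for all $v \in
\mathcal{S}^p_0(\mathcal{T}_\infty)$ with $\|\nabla v\|_{L^2(\Omega)} = 1$

$$\big\langle \big((D\mathcal{L})|_{\mathcal{S}^p_0(\mathcal{T}_\infty)}u\big)(v)\,,\,v \big\rangle = \lim_{\delta \to 0} \delta^{-2}\,\langle \mathcal{L}(u + \delta v) - \mathcal{L}u\,,\,u + \delta v - u\rangle \gtrsim \delta^{-2}\,\|\nabla(u + \delta v - u)\|_{L^2(\Omega)}^2 = 1.$$

Hence, we have $\big((D\mathcal{L})|_{\mathcal{S}^p_0(\mathcal{T}_\infty)}u\big)(v) \neq 0$ in $\mathcal{S}^p_0(\mathcal{T}_\infty)^*$ for all $v \in \mathcal{S}^p_0(\mathcal{T}_\infty) \setminus \{0\}$. This concludes
the proof. □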
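
The key step of the preceding proof can be written out with the constant of (49) made explicit. The following lines are a sketch under the assumption that the Fréchet derivative is evaluated as a directional derivative at $u$ in direction $v$; the shorthand $D\mathcal{L}(u)$ for $(D\mathcal{L})|_{\mathcal{S}^p_0(\mathcal{T}_\infty)}u$ is used only here.

```latex
\begin{align*}
\langle D\mathcal{L}(u)\,v\,,\,v\rangle
 &= \lim_{\delta\to 0}\,\delta^{-1}\,\langle \mathcal{L}(u+\delta v)-\mathcal{L}u\,,\,v\rangle
  = \lim_{\delta\to 0}\,\delta^{-2}\,\langle \mathcal{L}(u+\delta v)-\mathcal{L}u\,,\,(u+\delta v)-u\rangle \\
 &\ge \lim_{\delta\to 0}\,\delta^{-2}\,C_{\rm mon}\,\|\nabla(\delta v)\|_{L^2(\Omega)}^2
  = C_{\rm mon}\,\|\nabla v\|_{L^2(\Omega)}^2 .
\end{align*}
% The inequality is the strong monotonicity (49), applied with w = u + \delta v and v = u
% for each fixed \delta \neq 0. In particular, D\mathcal{L}(u)v = 0 forces v = 0.
```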

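Lemma 16 (Taylor). For all $v,w \in H^1_0(\Omega)$ with $\|\nabla(u - v)\|_{L^2(\Omega)} + \|\nabla(u - w)\|_{L^2(\Omega)} \le \varepsilon_{\rm loc}$,
there holds

$$\|\mathcal{L}w - \mathcal{L}v - D\mathcal{L}(v)(w - v)\|_{H^{-1}(\Omega)} \le C_{\rm loc}\,\|\nabla(w - v)\|_{L^2(\Omega)}^2, \tag{55a}$$

$$\|\mathcal{A}w - \mathcal{A}v - D\mathcal{A}(v)(w - v)\|_{H^{-1}(\Omega)} \le C_{\rm loc}\,\|\nabla(w - v)\|_{L^2(\Omega)}^2. \tag{55b}$$

Proof. The local boundedness (52) together with [13, Theorem 6.5] applied to the operators
$\mathcal{L}$ and $\mathcal{A}$ proves the statement. □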

6.5.3. Quasi-orthogonality. Following the steps of Section 3.2, we derive a similar 
result for the present, non-linear case. 

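Lemma 17. The sequence $(e_\ell)_{\ell \in \mathbb{N}}$ defined by

$$e_\ell := \begin{cases} \dfrac{u - U_\ell}{\|\nabla(u - U_\ell)\|_{L^2(\Omega)}} & \text{for } u \neq U_\ell,\\[2ex] 0 & \text{else} \end{cases}$$

converges to zero, weakly in $H^1_0(\Omega)$.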

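Proof. With Galerkin orthogonality and the convention $\infty \cdot 0 = 0$, we obtain

$$\lim_{\ell \to \infty} \frac{\langle \mathcal{L}u - \mathcal{L}U_\ell\,,\,V_k\rangle}{\|\nabla(u - U_\ell)\|_{L^2(\Omega)}} = 0 \quad\text{for all } V_k \in \mathcal{S}^p_0(\mathcal{T}_k) \text{ and } k \in \mathbb{N}.$$

By continuity of the duality brackets, this results in convergence for all $v \in \mathcal{S}^p_0(\mathcal{T}_\infty)$

$$\frac{\langle \mathcal{L}u - \mathcal{L}U_\ell\,,\,v\rangle}{\|\nabla(u - U_\ell)\|_{L^2(\Omega)}} \to 0 \quad\text{as } \ell \to \infty.$$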

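By use of (55a), we observe for all $v \in \mathcal{S}^p_0(\mathcal{T}_\infty)$

$$\frac{|\langle \mathcal{L}u - \mathcal{L}U_\ell\,,\,v\rangle|}{\|\nabla(u - U_\ell)\|_{L^2(\Omega)}} \ge \frac{|\langle (D\mathcal{L}u)(u - U_\ell)\,,\,v\rangle|}{\|\nabla(u - U_\ell)\|_{L^2(\Omega)}} - \frac{C_{\rm loc}\,\|\nabla(u - U_\ell)\|_{L^2(\Omega)}^2\,\|\nabla v\|_{L^2(\Omega)}}{\|\nabla(u - U_\ell)\|_{L^2(\Omega)}}.$$

With convergence $U_\ell \to u$ in $H^1_0(\Omega)$ from (54), this implies immediately

$$\frac{\langle (D\mathcal{L}u)(u - U_\ell)\,,\,v\rangle}{\|\nabla(u - U_\ell)\|_{L^2(\Omega)}} \to 0 \quad\text{as } \ell \to \infty \quad\text{for all } v \in \mathcal{S}^p_0(\mathcal{T}_\infty). \tag{56}$$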

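According to Lemma 15, $(D\mathcal{L})|_{\mathcal{S}^p_0(\mathcal{T}_\infty)}u$ is injective. Therefore, its adjoint $\big((D\mathcal{L})|_{\mathcal{S}^p_0(\mathcal{T}_\infty)}u\big)^*$
is surjective onto $\mathcal{S}^p_0(\mathcal{T}_\infty)^*$. Hence, (56) is equivalent to $e_\ell \rightharpoonup 0$ as $\ell \to \infty$. This concludes
the proof. □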

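To abbreviate notation, we define the quasi-metric

$$d(w,v)^2 := \langle \mathcal{L}w - \mathcal{L}v\,,\,w - v\rangle \quad\text{for all } w,v \in H^1_0(\Omega).$$

Note that due to (48)-(49), there holds

$$C_{\rm norm}^{-1}\,\|\nabla(w - v)\|_{L^2(\Omega)}^2 \le d(w,v)^2 \le C_{\rm norm}\,\|\nabla(w - v)\|_{L^2(\Omega)}^2 \quad\text{for all } w,v \in H^1_0(\Omega) \tag{57}$$

with $C_{\rm norm} = \max\{2C_{\rm lip}, C_{\rm mon}^{-1}\} > 0$.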

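Proposition 18. For any $\varepsilon > 0$, there exists $\ell_0 \in \mathbb{N}$ such that

$$d(U_{\ell+1},U_\ell)^2 \le \frac{1}{1 - \varepsilon}\,d(u,U_\ell)^2 - d(u,U_{\ell+1})^2 \tag{58}$$

for all $\ell \ge \ell_0$.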

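Proof. Due to convergence $U_\ell \to u$ in $H^1_0(\Omega)$ (54), there exists $\ell_1 \in \mathbb{N}$ such that for all
$\ell \ge \ell_1$ we may apply (55b) to obtain

$$|\langle \mathcal{A}U_{\ell+1} - \mathcal{A}U_\ell\,,\,u - U_{\ell+1}\rangle| \le |\langle D\mathcal{A}(U_{\ell+1})(U_{\ell+1} - U_\ell)\,,\,u - U_{\ell+1}\rangle| + C_{\rm loc}\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}^2\,\|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}.$$

Using the symmetry of $D\mathcal{A}(U_{\ell+1})$ and applying (55b) once more, we conclude

$$\begin{aligned}
|\langle \mathcal{A}U_{\ell+1} - \mathcal{A}U_\ell\,,\,u - U_{\ell+1}\rangle|
&\le |\langle D\mathcal{A}(U_{\ell+1})(u - U_{\ell+1})\,,\,U_{\ell+1} - U_\ell\rangle| + C_{\rm loc}\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}^2\,\|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}\\
&\le |\langle \mathcal{A}u - \mathcal{A}U_{\ell+1}\,,\,U_{\ell+1} - U_\ell\rangle|\\
&\quad + C_{\rm loc}\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}\,\|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}\,\big(\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)} + \|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}\big).
\end{aligned}$$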

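Analogously to the estimate above, we obtain a lower estimate. For any $\delta > 0$, we may
thus use convergence $U_\ell \to u$ as $\ell \to \infty$ to find an index $\ell_0 \in \mathbb{N}$ such that

$$\Big|\,|\langle \mathcal{A}U_{\ell+1} - \mathcal{A}U_\ell\,,\,u - U_{\ell+1}\rangle| - |\langle \mathcal{A}u - \mathcal{A}U_{\ell+1}\,,\,U_{\ell+1} - U_\ell\rangle|\,\Big| \le \delta\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}\,\|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}$$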

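for all $\ell \ge \ell_0$. Since $e_\ell$ converges to zero weakly in $H^1_0(\Omega)$, we have strong convergence
$e_\ell \to 0$ as $\ell \to \infty$ in $L^2(\Omega)$. This together with Lipschitz continuity (48) allows us to
estimate

$$|\langle \mathcal{K}U_{\ell+1} - \mathcal{K}U_\ell\,,\,u - U_{\ell+1}\rangle| \lesssim \|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}\,\|e_{\ell+1}\|_{L^2(\Omega)}\,\|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}$$

and hence

$$|\langle \mathcal{K}U_{\ell+1} - \mathcal{K}U_\ell\,,\,u - U_{\ell+1}\rangle| \le \delta\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}\,\|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}$$

for all $\ell \ge \ell_1$. The adjoint term follows analogously, since

$$|\langle \mathcal{K}u - \mathcal{K}U_{\ell+1}\,,\,U_{\ell+1} - U_\ell\rangle| \le |\langle \mathcal{K}u - \mathcal{K}U_{\ell+1}\,,\,U_{\ell+1} - u\rangle| + |\langle \mathcal{K}u - \mathcal{K}U_{\ell+1}\,,\,u - U_\ell\rangle|.$$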
So far, we end up with 

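$$\begin{aligned}
&|\langle \mathcal{K}U_{\ell+1} - \mathcal{K}U_\ell\,,\,u - U_{\ell+1}\rangle| + |\langle \mathcal{K}u - \mathcal{K}U_{\ell+1}\,,\,U_{\ell+1} - U_\ell\rangle|\\
&\quad\le \delta\,\Big( \|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}\,\|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)} + \|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}^2 + \|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}\,\|\nabla(u - U_\ell)\|_{L^2(\Omega)} \Big)\\
&\quad\le \tfrac{\delta}{2}\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}^2 + 2\delta\,\|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}^2 + \tfrac{\delta}{2}\,\|\nabla(u - U_\ell)\|_{L^2(\Omega)}^2
\end{aligned}$$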

by use of Young's inequality. Putting everything together, we obtain 

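$$\begin{aligned}
&|\langle (\mathcal{A} + \mathcal{K})U_{\ell+1} - (\mathcal{A} + \mathcal{K})U_\ell\,,\,u - U_{\ell+1}\rangle|\\
&\quad\le |\langle \mathcal{A}u - \mathcal{A}U_{\ell+1}\,,\,U_{\ell+1} - U_\ell\rangle| + \delta\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}\,\|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)} + |\langle \mathcal{K}U_{\ell+1} - \mathcal{K}U_\ell\,,\,u - U_{\ell+1}\rangle|\\
&\quad\le |\langle (\mathcal{A} + \mathcal{K})u - (\mathcal{A} + \mathcal{K})U_{\ell+1}\,,\,U_{\ell+1} - U_\ell\rangle| + \delta\,\|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}\,\|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}\\
&\qquad + |\langle \mathcal{K}U_{\ell+1} - \mathcal{K}U_\ell\,,\,u - U_{\ell+1}\rangle| + |\langle \mathcal{K}u - \mathcal{K}U_{\ell+1}\,,\,U_{\ell+1} - U_\ell\rangle|\\
&\quad\le 3\delta\,\Big( \|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}^2 + \|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}^2 + \|\nabla(u - U_\ell)\|_{L^2(\Omega)}^2 \Big),
\end{aligned}$$

where we used Galerkin orthogonality $\langle (\mathcal{A} + \mathcal{K})u - (\mathcal{A} + \mathcal{K})U_{\ell+1}\,,\,U_{\ell+1} - U_\ell\rangle = 0$ to obtain
the last estimate. With that at hand, we obtain similarly to (29)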



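$$\begin{aligned}
d(U_{\ell+1},U_\ell)^2 &\le d(u,U_\ell)^2 - d(u,U_{\ell+1})^2 + |\langle (\mathcal{A} + \mathcal{K})U_{\ell+1} - (\mathcal{A} + \mathcal{K})U_\ell\,,\,u - U_{\ell+1}\rangle|\\
&\le d(u,U_\ell)^2 - d(u,U_{\ell+1})^2 + 3\delta\,\Big( \|\nabla(U_{\ell+1} - U_\ell)\|_{L^2(\Omega)}^2 + \|\nabla(u - U_{\ell+1})\|_{L^2(\Omega)}^2 + \|\nabla(u - U_\ell)\|_{L^2(\Omega)}^2 \Big).
\end{aligned}$$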

With the equivalence (57), we conclude 

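$$(1 - 3C_{\rm norm}\delta)\,d(U_{\ell+1},U_\ell)^2 \le (1 + 3C_{\rm norm}\delta)\,d(u,U_\ell)^2 - (1 - 3C_{\rm norm}\delta)\,d(u,U_{\ell+1})^2$$

for all $\ell \ge \ell_0$. Finally, we choose $\delta > 0$ sufficiently small such that $(1 + 3C_{\rm norm}\delta)/(1 - 3C_{\rm norm}\delta) \le 1/(1 - \varepsilon)$ and conclude the proof. □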
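
Two steps of the preceding proof can be made explicit. The following is a sketch, assuming that $d(w,v)^2 = \langle \mathcal{L}w - \mathcal{L}v\,,\,w - v\rangle$ with $\mathcal{L} = \mathcal{A} + \mathcal{K}$ and that $0 < \varepsilon < 1$; the explicit threshold for $\delta$ is derived here and is not taken from the original text.

```latex
% (i) Expansion of d(u,U_\ell)^2, using u - U_\ell = (u - U_{\ell+1}) + (U_{\ell+1} - U_\ell)
%     in both arguments of the duality bracket:
\begin{align*}
d(u,U_\ell)^2
 &= \langle \mathcal{L}u - \mathcal{L}U_\ell\,,\,u - U_\ell\rangle \\
 &= d(u,U_{\ell+1})^2 + d(U_{\ell+1},U_\ell)^2
  + \underbrace{\langle \mathcal{L}u - \mathcal{L}U_{\ell+1}\,,\,U_{\ell+1} - U_\ell\rangle}_{=\,0
      \text{ (Galerkin orthogonality)}}
  + \langle \mathcal{L}U_{\ell+1} - \mathcal{L}U_\ell\,,\,u - U_{\ell+1}\rangle .
\end{align*}
% Rearranging and bounding the remaining cross term by its modulus gives
% d(U_{\ell+1},U_\ell)^2 \le d(u,U_\ell)^2 - d(u,U_{\ell+1})^2
%   + |\langle \mathcal{L}U_{\ell+1} - \mathcal{L}U_\ell\,,\,u - U_{\ell+1}\rangle| .

% (ii) Admissible choice of \delta: with x := 3 C_{\rm norm}\delta \in (0,1),
\[
\frac{1+x}{1-x} \le \frac{1}{1-\varepsilon}
\iff (1+x)(1-\varepsilon) \le 1-x
\iff x\,(2-\varepsilon) \le \varepsilon
\iff \delta \le \frac{\varepsilon}{3\,C_{\rm norm}\,(2-\varepsilon)} .
\]
```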

Together with estimator reduction from Lemma 14, the quasi-Galerkin orthogonality
(58) of Proposition 18 allows one to prove the $R$-linear convergence of Theorem 8, if one
exchanges $|\!|\!|u - U_{\ell+1}|\!|\!|$ and $|\!|\!|U_{\ell+1} - U_\ell|\!|\!|$ with $d(u,U_{\ell+1})$ and $d(U_{\ell+1},U_\ell)$, respectively.

Therefore, all the results from Section 5 hold (cf. the remarks after Theorem 8 and the
proof of Theorem 9) and, in particular, we obtain the optimality result of Theorem 9.